Abstract

A model of mutation rate evolution for multiple loci under arbitrary
selection is analyzed. Results are obtained using techniques from Karlin (1982)
that overcome the weak selection constraints needed for tractability
in prior studies of multilocus event models.

A multivariate form of the reduction principle is found: reduction 
results at individual loci combine topologically to produce a surface of 
mutation rate alterations that are neutral for a new modifier allele. New 
mutation rates survive if and only if they fall below this surface, a
generalization of the hyperplane found by Zhivotovsky et al. (1994) for a
multilocus recombination modifier. Increases in mutation rates at some 
loci may evolve if compensated for by decreases at other loci. The strength 
of selection on the modifier scales in proportion to the number of germline 
cell divisions, and increases with the number of loci affected. Loci that do 
not make a difference to marginal fitnesses at equilibrium are not subject 
to the reduction principle, and under fine tuning of mutation rates would 
be expected to have higher mutation rates than loci in mutation-selection 
balance. 

Other results include the nonexistence of 'viability analogous, Hardy-Weinberg'
modifier polymorphisms under multiplicative mutation, and
the sufficiency of average transmission rates to encapsulate the effect of 
modifier polymorphisms on the transmission of loci under selection. A 
conjecture is offered regarding situations, like recombination in the pres- 
ence of mutation, that exhibit departures from the reduction principle. 
Constraints for tractability are: tight linkage of all loci, initial fixation 
at the modifier locus, and mutation distributions comprising transition 
probabilities of reversible Markov chains.

1 Introduction

Genetic systems have the same material basis as developmental and physio- 
logical systems — proteins, nucleotides, regulatory sequences, gene interaction 
networks, and self-organizing structures and activities in the cell and organism. 
For the evolution of genetic systems, however, the Darwinian paradigm of herita- 
ble variation for fitness runs into a complication: genetic variation for heredity 
can change the content of an organism's contribution to the next generation 
without necessarily changing the quantity, i.e. the organism's fitness. Genetic 
variation for heredity can therefore be selectively neutral yet still enter into the 
evolutionary dynamics of the population. Models of selectively neutral genes
that modify genetic transmission — modifier genes — have been posed and 
analyzed in order to understand the evolutionary forces on the genetic system 
itself. 

The methodology of the neutral modifier model is to find out what effects a 
modifier allele must have on transmission in order to survive, as a function of 
the conditions of the population (such as the selection regime, existing genetic 
system, current genes in the population, population size, etc.). Since the modi- 
fier locus is assumed to have no intrinsic effect on fitness, its differential survival 
requires it become associated with alleles at loci under selection that have above 
average fitness. This is called induced selection (also referred to as 'secondary
selection' (Karlin and McGregor, 1972b; Kondrashov, 1995)). The task of modifier
theory is to find out which effects on transmission cause a modifier allele
to become associated with fitter genotypes.

Any particular system can be simulated to find out the result, and a region 
of systems evaluated, but one cannot be sure how such results interpolate or 
extrapolate without analytical results. A 'complete theory' of modifier genes 
would be a complete classification of population conditions and modifier effects 
that would produce modifier allele survival. The current state of theory is 
far from this complete classification due to the limitations of mathematical 
techniques. Analytical results have been obtained only for models that are 
great simplifications of reality. The relevance of their results to real systems 
is justifiable only by the premise that the results extend beyond the simplified 
models into the space of real systems. One may argue that this premise is 
implicit in the use of all theoretical results from simplified models. 

This premise is always uncertain. To lessen the uncertainty, one would like 
to analyze models that are ever closer to reality. Modifier theory has a history 
of being extended to ever more realistic and general models. One result that 
has reappeared throughout this sequence of greater realism is the Reduction 
Principle: that populations near equilibrium under a balance of selection and
transformation processes will evolve in the direction of reduced rates of those 
transformation processes. 

Modifier models exhibiting the reduction principle have mostly shared one 
glaring departure from reality: that only a single transforming event during 
reproduction occurs for the genes under selection — i.e. a single mutation, or 
single crossover. In reality, multiple transformation events are the rule during 
reproduction. This paper takes one step toward greater reality by modeling
modifiers of multiple events. 



1.1 The Reduction Principle 

In the first analyses of genetic modifiers of mutation, recombination, and migration
in the 1970s, a common result kept appearing, which was that only
reduced levels of mutation, recombination, or migration could evolve when populations
were near equilibrium under a balance between the forces of selection
and transmission. Feldman (1972) discovered the first example of this analytical
reduction result for modifiers of recombination between two loci with two alleles
under viability selection, for multiplicative and symmetric viability regimes.
Subsequent studies extended the reduction result to larger and larger spaces
[...] rates, and arbitrary viability selection regimes (Karlin and McGregor, 1972b;
Feldman and Balkau, 1973; Balkau and Feldman, 1973; Karlin and McGregor,
1974; Feldman and Krakauer, 1976; Teague, 1977; Feldman et al., 1980).



It so happened that during this same time period, on a seemingly unrelated
topic (how population subdivision would affect the maintenance of genetic
variation), Karlin (1976, 1982) developed two general theorems on the spectral
radius of perturbations of migration-selection systems. These theorems show
how, for two different kinds of variation in migration, a greater level of 'mixing'
lowers the spectral radius of the stability matrix for the system, and reduces
the number of alleles that exist as protected polymorphisms. The theorems
first appear, without proof, in Karlin (1976, pp. 642-647), and with proof as
Theorems 5.1 and 5.2 in Karlin (1982).

The 'mixing' that occurs in migration is dynamically analogous to the 'mixing'
of genetic information that occurs during reproduction. Altenberg (1984)
found that Karlin's Theorem 5.2 applied to the form of variation modeled in the 
literature that exhibited the reduction result, through use of a general represen- 
tation of genetic transmission, which hides the details of the genetic system but 
makes explicit the form that variation in transmission takes. Because of the gen- 
erality of Theorem 5.2, its applicability meant that the reduction result could be 
extended to arbitrary genetic systems and processes being modified (for which 
recombination, mutation, and migration are special cases), arbitrary numbers 
of alleles and loci, and arbitrary selection regimes — a level of generality not 
often attainable in population genetics theory. 

However, the tradeoff for this generality is the very specific way that the 
modifier gene must vary genetic transmission in order for Theorem 5.2 to apply:
the modifier gene must scale equally all transmission probabilities between
different genotypes. This is referred to as linear variation (Altenberg, 1984;
Altenberg and Feldman, 1987). Linear variation has the form:

$$T(i \leftarrow j,k) = \alpha\, P(i \leftarrow j,k) \quad \text{for } j, k \neq i,$$

where $T(i \leftarrow j,k)$ is the probability that parental haplotypes $j$ and $k$ produce
a gamete haplotype $i$, and $\alpha$ is the parameter controlled by the modifier gene
that scales the transmission rates $P(i \leftarrow j,k)$.
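
As an illustration only, the following Python sketch applies linear variation in the simplest setting, where transmission depends on a single parental haplotype (as in mutation). The matrix `P`, the helper name `linear_variation`, and the numerical values are assumptions made for this example; they are not taken from the analysis.

```python
import numpy as np

def linear_variation(P, alpha):
    """Return T with T[i, j] = alpha * P[i, j] for i != j.

    P is a column-stochastic matrix (columns sum to 1) of transformation
    probabilities. The diagonal of T absorbs the remaining probability,
    so T stays column-stochastic.
    """
    P = np.asarray(P, dtype=float)
    T = alpha * P                      # scale every transformation probability
    np.fill_diagonal(T, 0.0)           # discard the scaled diagonal
    np.fill_diagonal(T, 1.0 - T.sum(axis=0))  # probability of no transformation
    return T

# Hypothetical three-haplotype transformation distribution (zero diagonal).
P = np.array([[0.0, 0.3, 0.5],
              [0.6, 0.0, 0.5],
              [0.4, 0.7, 0.0]])
T_half = linear_variation(P, 0.5)
```

A single parameter moves every off-diagonal entry in the same proportion, which is what "hit only once" amounts to in matrix terms.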

Subsequent studies of the reduction principle for linear variation include
Feldman and Liberman (1986), Liberman and Feldman (1986b), Liberman and Feldman
(1986a), Altenberg and Feldman (1987), and Altenberg (2009).

What linear variation means, in biological terms, is that during reproduc- 
tion, the genotype is 'hit' only once by the transformation processes, and the 
probability that this hit occurs is what is controlled by the modifier gene. Be- 
ing hit only once, however, is manifestly unrealistic because reproduction almost 
universally exhibits multiple independent transformation events, including mul- 
tiple mutations, crossovers, chromosomal reassortment, transpositions, and their 
combinations. 

The literature has explored the realm of multiple-hit genetic transformation 
models to a very limited extent, but even here, important phenomena have been 
discovered. These studies can be classified into two categories: 

1. models of two mixed processes, where the modifier gene controls one trans- 
formation process, but a second, different transformation process occurs 
outside of its control; and 

2. models of a single process that can occur multiple times among different
loci; in this case, the models are all of multilocus recombination modification
(Zhivotovsky et al., 1994; Zhivotovsky and Feldman, 1995).



1.1.1 Mixed Processes 



Mixed processes are notable in that they are where departures from the reduction
principle are found in near-equilibrium populations. The mixed process
of greatest interest has been recombination in the presence of mutation
(Feldman et al., 1980; Charlesworth, 1990; Otto and Feldman, 1997; Pylkov et al.,
1998), and the departures from the reduction result are the basis of the
'deterministic mutation hypothesis' for the evolution of sex (Kondrashov, 1982,
1984). Other mixed processes studied include: the evolution of recombination in
the presence of migration (Charlesworth and Charlesworth, 1979; Pylkov et al.,
1998), or segregation and syngamy (which self-fertilization exposes in the recursion)
(Charlesworth et al., 1979; Holsinger and Feldman, 1983a); or models of
the evolution of multiple mutation processes (Altenberg, 1984, pp. 137-151), or
mutation in the presence of segregation and syngamy (also exposed in the recursion
by self-fertilization (Holsinger and Feldman, 1983b)) or fertility selection
(Holsinger et al., 1986).



The departures from the reduction principle caused by mixed processes are 
summarized by the 'principle of partial control': when the modifier gene has 
only partial control over the transformation occurring at loci under selection, 
then it may be possible for the part it controls to evolve an increase in rates
(Altenberg, 1984, pp. 149, 225-228).

1.1.2 Multiple Hit Processes

In the majority of these models of mixed processes, the process controlled by 
the modifier gene still occurs as a single event during reproduction. Multiple 
events under modifier control are studied in a model of recombination between
multiple loci in Zhivotovsky et al. (1994), Zhivotovsky and Feldman (1995), and
Pylkov et al. (1998). By assuming fitness differences near zero (i.e. weak selection),
two alleles per locus, and an unlinked modifier locus, these studies obtain
analytic results for a modifier gene that has arbitrary control over recombination
distributions that include multiple recombination events.



The main result in Zhivotovsky et al. (1994) is that they find a more sophisticated
reduction principle at work: a new modifier allele will increase when
rare if and only if it reduces a certain weighted sum of recombination prob- 
abilities. Notably, particular recombination events may evolve an increase in 
rates, as long as the weighted sum is decreased. This more complex result is 
distinguished by the term 'generalized reduction principle'. 

1.1.3 Multiple Hit Processes Under Strong Selection 

Can the constraints of weak selection and two alleles per locus in these prior 
studies be dropped? That is the aim of this paper. Techniques from the proof
of Theorem 5.1 in Karlin (1982) allow one to obtain analytic results for a more
general modifier model with:

1. multiple loci under selection, 

2. multiple alleles at those loci, 

3. arbitrary viability selection regimes of any strength, 

4. arbitrary control over the rates of the multiple events, and 

5. arbitrary numbers of cell divisions from zygote to gamete. 

The latter generalization — to multiple cell divisions in the gamete line — is 
novel to this study; multiple cell divisions fundamentally rule out models with 
linear variation, and require multiple-hit theory. 

To use these techniques, however, a different set of constraints is needed: 

1. the population begins fixed at the modifier locus, 

2. the only type of event is mutation, 

3. mutation events occur at each locus independently, 

4. the mutation rates at each locus are scaled equally by the modifier locus, 

5. mutation distributions must have the form of transition probabilities of 
reversible Markov chains, and 

6. no other genetic processes occur, including recombination. 

This last constraint, an absence of recombination, produces the greatest
distance from realism in this model. The distance of a constraint from physical
reality, however, may not reflect its distance mathematically from techniques
that make it unnecessary. This was the case with the reduction results
in Altenberg (1984) and Altenberg and Feldman (1987), whose proofs depend
on the assumption that no recombination occurs between the modifier locus and
the loci under selection; addition of a single mathematical technique allowed this
constraint to be dropped in the proof of the reduction result (Altenberg, 2009).
The present results, which require an absence of recombination, are similarly 
presented with the hope that future developments will allow this constraint to 
be removed. 

The modifier gene here is assumed to produce linear variation for single loci, 
but multiple independent events occur over multiple loci. In other words, the 
modifier gene scales equally the mutation probabilities between all alleles at 
each single locus, but the probability of multiple mutations is the product of 
the probabilities of the single mutations. Furthermore, the modifier is allowed 
arbitrary control over the mutation rate parameter for each locus. 

Under these assumptions, one loses the use of Karlin's Theorem 5.2, since 
it is impossible generically for the modifier gene to produce linear variation in 
transmission. However, Karlin has another theorem — 5.1 — which applies to 
a different form of variation that has many more degrees of freedom (see (14)).
And, as it turns out, the form of variation treated in Karlin's Theorem 5.1 is 
perfectly suited to multiple event models. 

Karlin's Theorem 5.1, which does not appear to have been utilized in
the literature since its original publication (Karlin, 1982), here comes into its
own. Theorem 5.1 considers stochastic matrices Karlin defines as symmetrizable,
which affords use of the Rayleigh-Ritz variational characterization of the spectral
radius. While Karlin does not mention it, symmetrizable stochastic matrices are
one and the same as transition matrices for time-reversible Markov chains (see
Lemma 2), which are assumed for most models of mutation in phylogenetic
reconstruction (Squartini and Arndt, 2008). In an earlier version of this paper,
I used Theorem 5.1 directly, but with further consideration it turns out that the
critical tools needed are actually certain steps in Karlin's proof (Karlin, 1982,
pp. 114-116, 197-198).
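
One direction of the stated equivalence can be checked numerically. The sketch below is illustrative and rests on assumptions of its own: the chain is built from an arbitrary symmetric weight matrix `C`, and the helper names `reversible_chain` and `symmetrize` are hypothetical. It verifies that a column-stochastic matrix satisfying detailed balance with stationary distribution $\pi$ factors as $LSR$ with $L = D_\pi^{1/2}$, $R = D_\pi^{-1/2}$, and $S$ symmetric.

```python
import numpy as np

def reversible_chain(C):
    """Build a column-stochastic P and its stationary pi from symmetric weights C."""
    C = np.asarray(C, dtype=float)
    col = C.sum(axis=0)
    P = C / col                # entry (i, j) is C[i, j] / col[j]
    pi = col / col.sum()       # stationary distribution of P
    return P, pi

def symmetrize(P, pi):
    """Return S = D^{-1/2} P D^{1/2} with D = diag(pi)."""
    s = np.sqrt(pi)
    return P * s[None, :] / s[:, None]   # S[i, j] = P[i, j] * s[j] / s[i]

# Arbitrary symmetric, non-negative weights (an assumption of this example).
C = np.array([[2.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 1.0]])
P, pi = reversible_chain(C)
S = symmetrize(P, pi)
flux = P * pi[None, :]         # flux[i, j] = P[i, j] * pi[j]
```

Here `flux` being symmetric is the detailed balance condition, and `S` being symmetric is what makes the spectrum of `P` real, which is the property that the variational characterization relies on.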

Application of these tools to this multiple-hit model yields, unsurprisingly,
the reduction result. Moreover, the result has the form of the 'generalized
reduction principle' delineated by Zhivotovsky et al. (1994), in that mutation
rates can increase at some loci provided that mutation rates decrease sufficiently 
at other loci. 

The weighted average of the mutation rates found by Zhivotovsky et al.
(1994) to be the criterion for the initial increase of the modifier is shown here
to actually be the linear limit of a larger object: namely, a smooth manifold of 
mutation rates that divides the space of mutation rates into those that will cause 
a modifier to invade, and those that will cause it to go extinct. The existence of 
this manifold is found to be a topological necessity from the single-locus reduc- 
tion result, shown using the Intermediate Value Theorem and Implicit Function 
Theorem.

For clarity, the main results of the paper from Section 4.2 are previewed
here: 

Main Results (Multivariate Reduction Principle for Symmetrizable Muta- 
tion Rates at Multiple Loci). Consider a genetic system in which a modifier 
locus controls the mutation rates of a group of loci under viability selection. 
Mutations occur independently among the loci under selection. In a population 
near equilibrium under a stable mutation-selection balance, fixed at the modifier
locus, let a new allele of the modifier locus be introduced. The new modifier al- 
lele can change the mutation rate parameter separately for each locus, and each 
parameter scales equally the probability of mutations at that locus. 
Under the following constraints: 

1. mutation rates at each locus range between 0 and 1/2,

2. no recombination or other transformation process acts on the genes, 

3. the mutation matrix for each locus is irreducible, and 

4. is the transition matrix for some reversible Markov chain,

then the new modifier allele will increase (decrease) in frequency at a geometric 
rate if, among the loci that affect the marginal fitnesses: 

1. it reduces (increases) the mutation rate at any locus, and does not increase 
(decrease) the mutation rates at any locus; 

2. it increases the mutation rates for at least one locus, and decreases the 
mutation rates for at least one locus, and falls below (above) the neutral 
manifold of mutation rates that includes the mutation rates at the equilib- 
rium. Should the mutation rates produced by the new modifier allele fall 
on this neutral manifold, then it will not change frequency at a geometric 
rate. 

Moreover, the further that the new set of mutation rates is from the neutral 
manifold, the stronger is the eventual induced selection for (against) the new 
modifier allele, up to a maximum fitness of $\max_i \hat{w}_i / \hat{\bar{w}}$ for a modifier allele that
eliminates all mutation. 

These results hold, in the case of multicellular organisms, for arbitrary num- 
bers of cell divisions between gamete generations. The strength of selection on 
the modifier locus scales in proportion to the number of cell divisions in the 
germline, and increases with the number of loci controlled by the modifier. 

The paper proceeds with an introduction to the general modifier gene model, 
followed by development of mathematical tools that will be used, key theorems, 
and finally their application to the modifier model. It concludes with a discus- 
sion of the particular implications of the main results, a discussion on the nature 
of models that depart from the reduction principle, and a conjecture about de- 
partures from the reduction principle that embodies the proposed explanation 
and can readily be tested. 

2 The General Evolutionary Model

I give a condensed exposition of the general modifier model developed in Altenberg
(1984) and Altenberg and Feldman (1987), used in Zhivotovsky et al. (1994),
Zhivotovsky and Feldman (1995), and Pylkov et al. (1998), and described recently
in detail in Altenberg (2009).

The genome is structured in two parts: a group of loci experiencing natural 
selection, and, external to the group, a neutral locus that modifies their genetic 
transmission probabilities. The model assumes an infinite population, random 
mating, non-overlapping generations, frequency-independent viability selection, 
sex symmetry, and no sex-linkage. Although selection acts on diploid genotypes, 
the haplotype frequencies become dynamically sufficient state variables under 
random mating. Haplotypes have two indices: one for the haplotype of the loci
under selection ($i, j, k$), and one for the allele at the modifier locus ($a, b, c$). The
modifier allele is assumed to be transmitted without alteration and in Mendelian 
proportions (no mutation nor segregation distortion), so that the only force 
acting upon it is from associations it forms with the loci under selection. 

The recursion on the frequency of haplotypes from one generation to the
next is:

$$\bar{w}\, z'_{ai} = \sum_{bjk} T_{(r)}(ai \leftarrow aj|bk)\; w_{jk}\, z_{aj}\, z_{bk} \qquad (1)$$

where

$z_{ai}$ is the frequency of the haplotype with allele $a$ at the modifier locus, and
haplotype $i$ at the loci under selection; $z'_{ai}$ is its frequency in the next generation;

$w_{jk} = w_{kj}$ is the fitness of diploid genotype $jk$ for the loci under selection;

$\bar{w} := \sum_{abjk} w_{jk}\, z_{aj}\, z_{bk}$ is the mean fitness of the population;

$r_{ab}$ is the probability of recombination between the modifier locus and the nearest
locus under selection;

$T_{(r)}(ai \leftarrow aj|bk)$ is the probability that parental haplotypes $aj$ and $bk$ produce
an offspring haplotype $ai$, conditioned on the modifier allele of the offspring
being $a$:

$$T_{(r)}(ai \leftarrow aj|bk) := (1 - r_{ab})\, T(ai \leftarrow aj|bk) + r_{ab}\, T^{\times}(ai \leftarrow ak|bj),$$

where the probability that parental genotype $aj, bk$ produces gamete haplotype
$ai$ is:

$T(ai \leftarrow aj|bk)$, when no recombination occurs between the modifier and nearest
locus under selection, and

$T^{\times}(ai \leftarrow ak|bj)$, when recombination occurs between the modifier and nearest
locus under selection (hence $aj|bk$ becomes $ak|bj$). If there is no position
effect from the modifier locus, then $T(ai \leftarrow aj|bk) = T^{\times}(ai \leftarrow aj|bk)$.

So, $1 = \sum_i T(ai \leftarrow aj|bk) = \sum_i T^{\times}(ai \leftarrow ak|bj)$, $\forall\, a, b, j, k$.
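
The recursion (1) can be iterated numerically. The following is a minimal sketch under assumptions that are mine, not the paper's: the array layout `T[a, i, j, b, k]`, the helper names, a mutation-only transmission with no recombination between the modifier and the selected locus, and the particular fitness and frequency values.

```python
import numpy as np

def next_generation(z, w, T):
    """One generation of recursion (1).

    z[a, i]          : frequency of modifier allele a with haplotype i
    w[j, k]          : fitness of diploid genotype jk (symmetric)
    T[a, i, j, b, k] : T_(r)(ai <- aj | bk), summing to 1 over i
    """
    v = z.sum(axis=0)                 # haplotype frequencies at the selected loci
    w_bar = v @ w @ v                 # mean fitness
    z_next = np.einsum('aijbk,jk,aj,bk->ai', T, w, z, z) / w_bar
    return z_next, w_bar

def mutation_only_T(mats):
    """T for the mutation-only case: allele a applies column-stochastic mats[a]."""
    A, n = len(mats), mats[0].shape[0]
    T = np.empty((A, n, n, A, n))
    for a, M in enumerate(mats):
        T[a] = M[:, :, None, None]    # no dependence on the other parent (b, k)
    return T

# Hypothetical example: two modifier alleles, one selected locus with two alleles.
swap = np.array([[0.0, 1.0], [1.0, 0.0]])
mats = [(1 - mu) * np.eye(2) + mu * swap for mu in (0.05, 0.01)]
w = np.array([[1.0, 0.9], [0.9, 0.7]])
z0 = np.array([[0.45, 0.45], [0.05, 0.05]])
z1, w_bar0 = next_generation(z0, w, mutation_only_T(mats))
```

Because both modifier alleles start with the same haplotype distribution, the modifier allele frequencies are unchanged after this first step; induced selection on the modifier only appears once the two alleles have built up different associations with the selected haplotypes.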

At this point it becomes appropriate to point out a fundamental property 
of genetic transmission: 

Result 1 (Sufficiency of the Mean Transmission Probabilities). The transmis- 
sion probabilities enter into the dynamics of the haplotype frequencies of the loci 
under selection solely through their population averages, regardless of any form 
of underlying genetic variation for the transmission probabilities. 

Proof. Let

$$v_i := \sum_a z_{ai}$$

represent the frequency of haplotype $i$ comprising the loci under selection. The
population average of the transmission probabilities experienced by the loci
under selection is:

$$\overline{T}_{(r)}(i \leftarrow j|k) := \frac{\sum_{ab} T_{(r)}(ai \leftarrow aj|bk)\, z_{aj}\, z_{bk}}{\sum_{ab} z_{aj}\, z_{bk}} = \frac{1}{v_j v_k} \sum_{ab} T_{(r)}(ai \leftarrow aj|bk)\, z_{aj}\, z_{bk}. \qquad (2)$$

The recursion on $v_i$ is thus:

$$\bar{w}\, v'_i = \bar{w} \sum_a z'_{ai} = \sum_{abjk} T_{(r)}(ai \leftarrow aj|bk)\, w_{jk}\, z_{aj}\, z_{bk} = \sum_{jk} \overline{T}_{(r)}(i \leftarrow j|k)\, w_{jk}\, v_j\, v_k. \qquad (3)$$

Hence, any modifier polymorphism enters the dynamics of $v_i$ solely through the
population mean $\overline{T}_{(r)}(i \leftarrow j|k)$. □

The mean transmission probabilities $\overline{T}_{(r)}(i \leftarrow j|k)$ thus behave like a sufficient
statistic, in that no additional details about $z_{aj}$ or $T_{(r)}(ai \leftarrow aj|bk)$ matter to the
value of $v'_i$. Hence $\overline{T}_{(r)}(i \leftarrow j|k)$ screens off (Salmon, 1971, 1984; Brandon, 1982)
any details of polymorphisms of the modifier locus, such as allele frequencies or
linkage disequilibrium.

It should be noted that (3) cannot be used to define the dynamics, because
$\overline{T}_{(r)}(i \leftarrow j|k)$ is itself subject to change that is not definable in terms of $\{v_i\}$.
Hence the $\{z_{ai}\}$ are dynamically sufficient state variables (Lewontin, 1974, pp.
6-8), while $\{v_i\}$ are not.

2.1 Equilibrium Relations 

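A population at equilibrium under (1) must satisfy the constraint for each $b$:

$$\hat{\bar{w}}\, \hat{z}_{bi} = \sum_{cjk} T_{(r)}(bi \leftarrow bj|ck)\, w_{jk}\, \hat{z}_{bj}\, \hat{z}_{ck}, \qquad (4)$$

where the hat ($\hat{\ }$) indicates the marked variable is at an equilibrium value. In vector form:

$$\hat{z}_b = M_{(b)}\, D\, \hat{z}_b, \qquad (5)$$

where

$\hat{z}_b := (\hat{z}_{b1}\ \hat{z}_{b2} \cdots \hat{z}_{bn})^{\top}$ ($\top$ is the transpose),

$n$ is the number of different haplotypes for the group of loci under selection,

$D := \mathrm{diag}\left[\hat{w}_i / \hat{\bar{w}}\right]_{i=1}^{n}$,

$\hat{w}_j := \sum_{ck} \hat{z}_{ck}\, w_{jk}$, and

$$M_{(b)} := \left[\sum_{ck} T_{(r)}(bi \leftarrow bj|ck)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{ck}\right]_{i,j=1}^{n}.$$

Note that $D$ is a non-negative diagonal matrix, and $M_{(b)}$ is a (column) stochastic
matrix, since $\sum_i T_{(r)}(bi \leftarrow bj|ck) = 1$ for all $b, c, j, k$, hence

$$\sum_i [M_{(b)}]_{ij} = \sum_i \sum_{ck} T_{(r)}(bi \leftarrow bj|ck)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{ck} = \sum_{ck} \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{ck} = 1.$$

A perturbation of the equilibrium to $z_{bi} = \hat{z}_{bi} + \epsilon_{bi}$ produces:

$$\Big[\hat{\bar{w}} + 2 \sum_{bjck} \epsilon_{bj}\, w_{jk}\, \hat{z}_{ck} + \sum_{bjck} \epsilon_{bj}\, \epsilon_{ck}\, w_{jk}\Big] (\hat{z}_{bi} + \epsilon'_{bi}) = \sum_{cjk} T_{(r)}(bi \leftarrow bj|ck)\, w_{jk}\, (\hat{z}_{bj} + \epsilon_{bj})(\hat{z}_{ck} + \epsilon_{ck}). \qquad (6)$$

The system (4) is assumed to be stable to internal perturbations, i.e. for perturbations
where $\epsilon_{bi} \neq 0$ only for $b, i$ such that $\hat{z}_{bi} > 0$.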



2.2 Initial Increase of a New Modifier Allele 

The long-term evolution of genetic transmission depends on the properties that 
allow a new modifier allele to invade a population and be protected from ex- 
tinction. Hence the analysis focuses on perturbations of the equilibrium by rare 
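modifier alleles, entailing $\hat{z}_{ai} = 0$ for all $i$ for new modifier allele $a$. Making this
substitution, and ignoring all second and higher order terms in the perturbation,
the linear recursion on a new modifier allele, $a$, that perturbs (4) can be
represented in vector form as:

$$\epsilon'_a = M_{(a)}\, D\, \epsilon_a, \qquad (7)$$

where

$$\epsilon_a := (\epsilon_{a1}\ \epsilon_{a2} \cdots \epsilon_{an})^{\top}$$

and

$$M_{(a)} := \left[\sum_{bk} T_{(r)}(ai \leftarrow aj|bk)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{bk}\right]_{i,j=1}^{n} \qquad (8)$$

$$= (1-r) \left[\sum_{bk} T(ai \leftarrow aj|bk)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{bk}\right]_{i,j=1}^{n} + r \left[\sum_{bk} T^{\times}(ai \leftarrow ak|bj)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{bk}\right]_{i,j=1}^{n}. \qquad (9)$$
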

Modifier allele $a$ will increase at a geometric rate when rare if and only if the
spectral radius $\rho(M_{(a)}D)$ exceeds 1, and will decrease at a geometric rate when
rare if and only if the spectral radius $\rho(M_{(a)}D)$ is less than 1. Clearly, if $D = I$,
then $\rho(M_{(a)}D) = \rho(M_{(a)}) = 1$, so geometric rates of change in modifier allele
frequencies require $D \neq I$, a situation described by saying there is a positive
selection potential (Altenberg, 1984, 'fitness load' p. 63; Altenberg and Feldman,
1987):

$$V := \frac{\max_i \hat{w}_i}{\hat{\bar{w}}} - 1 > 0. \qquad (10)$$

We know from (5) that $\rho(M_{(b)}D) = 1$, since $\hat{z}_b = M_{(b)} D\, \hat{z}_b$, provided $\hat{z}_b > 0$,
is the only nonnegative eigenvector of $M_{(b)}D$.

The analysis consists of evaluating how the relationship between $M_{(a)}$ and
the matrices $\{M_{(b)}\}$ maps to the relationship between $\rho(M_{(a)}D)$ and
$\rho(M_{(b)}D) = 1$.
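
The invasion criterion can be illustrated numerically in the simplest case. The sketch below makes several assumptions of its own: a single selected locus with two alleles, mutation as the only transformation, a resident population fixed on modifier allele $b$, tight linkage, and arbitrary illustrative fitnesses. The function names are hypothetical. It iterates the resident recursion to equilibrium and then evaluates $\rho(M_{(a)}D)$ for a few candidate mutation rates.

```python
import numpy as np

def mutation_matrix(mu, P):
    """(1 - mu) I + mu P for a column-stochastic mutation distribution P."""
    return (1.0 - mu) * np.eye(P.shape[0]) + mu * P

def resident_equilibrium(M_b, w, iters=20000):
    """Iterate z' = M_b D(z) z, with D(z) = diag(w_i / w_bar), from a uniform start."""
    n = w.shape[0]
    z = np.full(n, 1.0 / n)
    for _ in range(iters):
        marg = w @ z                          # marginal fitnesses w_i
        z = M_b @ (marg * z) / (z @ marg)     # selection, then mutation
    return z

def invasion_rate(mu_a, P, w, z_hat):
    """Spectral radius of M_(a) D evaluated at the resident equilibrium z_hat."""
    marg = w @ z_hat
    D = np.diag(marg / (z_hat @ marg))
    eig = np.linalg.eigvals(mutation_matrix(mu_a, P) @ D)
    return float(np.max(np.abs(eig)))

P = np.array([[0.0, 1.0], [1.0, 0.0]])        # symmetric, hence reversible
w = np.array([[1.0, 0.9], [0.9, 0.7]])        # illustrative fitnesses
mu_b = 0.05                                   # resident mutation rate
z_hat = resident_equilibrium(mutation_matrix(mu_b, P), w)
rates = {mu_a: invasion_rate(mu_a, P, w, z_hat) for mu_a in (0.01, 0.05, 0.10)}
```

In this example the spectral radius exceeds 1 for the lower rate, equals 1 (to numerical precision) at the resident rate, and falls below 1 for the higher rate.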



2.3 Constraints for Tractability 

Evaluating how the relationship between $M_{(a)}$ and the matrices $\{M_{(b)}\}$ affects
$\rho(M_{(a)}D)$ is, in general, difficult. The addition of three constraints makes it
tractable: 

1. Mutation is the only transformation process acting on the loci under se- 
lection; 

2. the modifier locus is fixed on a single allele in the initial population; and 

3. the modifier locus is tightly linked to the loci under selection. 

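Mutation. In mutation, the products from transformation of a haplotype
depend on that haplotype alone, not on the haplotype from the other parent,
so $T_{(r)}(ai \leftarrow aj|bk)$ can be simplified to $T_{(r)}(ai \leftarrow aj|b)$, and (9) becomes:

$$M_{(a)} = \left[\sum_{bk} T_{(r)}(ai \leftarrow aj|b)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{bk}\right]_{i,j=1}^{n} = (1-r) \left[\sum_{bk} T(ai \leftarrow aj|b)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{bk}\right]_{i,j=1}^{n} + r \left[\sum_{bk} T^{\times}(ai \leftarrow ak|b)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{bk}\right]_{i,j=1}^{n}.$$
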
Fixation of the Modifier Locus. The sum over $k$ involves only the terms
$w_{jk}\hat{z}_{bk}$, and can be made to cancel out the $\hat{w}_j$ term if the initial population is
fixed on a single modifier allele $b$, since in that case

$$\sum_{bk} w_{jk}\, \hat{z}_{bk} = \sum_{k} w_{jk}\, \hat{z}_{bk} = \hat{w}_j,$$

and therefore

$$M_{(a)} = (1-r) \big[T(ai \leftarrow aj|b)\big]_{i,j=1}^{n} + r \left[\sum_{k} T^{\times}(ai \leftarrow ak|b)\, \frac{w_{jk}}{\hat{w}_j}\, \hat{z}_{bk}\right]_{i,j=1}^{n} = (1-r)\, T_{ab} + r\, T^{\times}_{ab}\, D_{\hat{z}_b} W D^{-1} / \hat{\bar{w}}, \qquad (11)$$

where

$W := [w_{ij}]_{i,j=1}^{n}$ is the matrix of fitness coefficients;

$D_{\hat{z}_b}$ represents a diagonal matrix whose diagonal entries are the entries of the
vector $\hat{z}_b$;

$$T_{ab} := \big[T(ai \leftarrow aj|b)\big]_{i,j=1}^{n} \quad \text{and} \quad T^{\times}_{ab} := \big[T^{\times}(ai \leftarrow aj|b)\big]_{i,j=1}^{n}.$$

The matrices $T_{ab}$ and $T^{\times}_{ab}$ do not depend on either the selection coefficients or
the haplotype frequencies, which is a great simplification of $M_{(a)}$. Additional,
more compelling, reasons to fix the initial population on a single modifier allele
arise from the structure of the transmission matrix, described in Section
2.4.1. Hence, fixation of the initial population on modifier allele $b$ is assumed
throughout the remainder of the paper.

No Recombination with the Modifier Locus. The analysis here follows
Karlin's (1982, p. 198) application of the Rayleigh-Ritz variational characterization
of the spectral radius (Wilkinson, 1965, pp. 172-173; Horn and Johnson,
1985, pp. 176-180), which requires that $M_{(a)}$ be symmetrizable, i.e. of the
form $LSR$, where $L$ and $R$ are positive diagonal matrices, and $S$ is a real symmetric
matrix.

For tight linkage of the modifier gene to the loci under selection, $r = 0$, so
(11) becomes simply $M_{(a)} = T_{ab}$, and symmetrizable $T_{ab}$ are readily defined.
This is treated in Section 3.

However, for looser linkage, $r > 0$, a key step in the analysis is blocked (see
footnote 3 in the proof of Theorem 2). When $r > 0$, the term

$$T^{\times}_{ab}\, D_{\hat{z}_b} W D^{-1} / \hat{\bar{w}}$$

precludes finding families of $M_{(a)}$ that are generically symmetrizable. For, suppose
that

$$M_{(a)} = (1 - r)\, T_{ab} + r\, T^{\times}_{ab}\, D_{\hat{z}_b} W D^{-1} / \hat{\bar{w}} = LSR.$$

Then

$$S = (1 - r)\, L^{-1} T_{ab} R^{-1} + r\, L^{-1} T^{\times}_{ab}\, D_{\hat{z}_b} W D^{-1} R^{-1} / \hat{\bar{w}}.$$

But if this expression is to be symmetric for any $r$, then both $L^{-1} T_{ab} R^{-1}$ and
$L^{-1} T^{\times}_{ab} D_{\hat{z}_b} W D^{-1} R^{-1}$ must be symmetric. The first term requires $T_{ab}$ be
symmetrizable, as before. But the second term, to be symmetric, forces mutation
rate matrix $T^{\times}_{ab}$ to depend on the equilibrium haplotype frequencies $\hat{z}_{bi}$
and the marginal fitnesses $\hat{w}_i$, which is contrary to the biological basis of mutation
rates, non-generic, and not useful for understanding the selective forces on
mutation rates.

Hence, for the remainder of the analysis, it is assumed that there is tight 
linkage between the modifier and the loci under selection, that mutation is the 
sole transformation process, and the initial population begins fixed on modifier 
allele b. Relaxation of each of these constraints would be a goal for future 
analytical methods. 



2.4 Multilocus Mutation Structure 

The biology of mutation provides a natural structure for multiple events. Each 
nucleotide is a locus for a possible mutation event. And in multicellular organ- 
isms, each cell division in the gamete lineage provides opportunities for mutation 
events (ranging from approximately 9 cell divisions in nematodes, 36 in flies, to
200 in humans (Lynch et al., 2008)).

Assuming that mutations occur independently at each nucleotide, and independently
from one cell division to the next, the probability of multiple
events is just the product of the probabilities of each event individually.
This is a standard assumption in many phylogenetic inference models (e.g.
see Yang and Nielsen, 2002; Whelan and Goldman, 2001). The modifier gene
is posited to rescale equally the probabilities of all single events at each locus.
So the modifier gene could be said to produce linear variation at each single 
locus, but not over the entire haplotype. Multiple cell divisions between gamete 
generations are represented in the dynamics simply as multiple powers of the 
mutation matrix. 

Multiple non-independent mutation events do happen in nature, however. A mutational event may involve multiple nucleotides. To decompose the probabilities in this case would require nested sums and products of transition matrices (an example with dinucleotide dependencies is modeled by Squartini and Arndt (2008) in the context of phylogenetic processes), and will not be pursued here.



The implementation of these assumptions is as follows. Let:

$L$ be the number of loci under selection, and $\xi, \kappa \in \{1, \ldots, L\}$ index the loci;

$\boldsymbol\mu$ be an $L$-long vector of mutation rates, one rate $\mu_\xi$ for each locus $\xi$, whose values are controlled by the modifier gene;

$\mu_\xi P^{(\xi)}_{ij}$ be the probability of mutation from allele $j$ to allele $i$ at locus $\xi$;

$P^{(\xi)}$ be the $\nu_\xi \times \nu_\xi$ transition matrix representing the mutation distribution at locus $\xi$;

$\nu_\xi$ be the number of possible alleles at locus $\xi$; and together,

$M^{(\xi)}_{\mu_\xi} := (1 - \mu_\xi) I^{(\xi)} + \mu_\xi P^{(\xi)} = I^{(\xi)} + \mu_\xi (P^{(\xi)} - I^{(\xi)})$ is the $\nu_\xi \times \nu_\xi$ transmission matrix for alleles at locus $\xi$.

Under the assumption that each locus mutates independently of the other loci, the transmission matrix for the entire space of haplotypes of loci under selection is represented by the Kronecker (tensor) product ($\otimes$):

$$M_{\boldsymbol\mu} = \bigotimes_{\xi=1}^{L} M^{(\xi)}_{\mu_\xi} = \bigotimes_{\xi=1}^{L} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi P^{(\xi)} \right]. \tag{12}$$

I will use the terms multivariate, multiplicative variation to refer to the way that (12) maps variation in $\boldsymbol\mu$ to variation in $M_{\boldsymbol\mu}$.
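The construction in (12) can be made concrete with a minimal sketch. It is an illustration only, not part of the original analysis: the helper names (`kron`, `locus_matrix`, `transmission_matrix`), the allele numbers, and the numerical rates are all hypothetical choices. Matrices are column-stochastic, following the convention used in this paper.

```python
from functools import reduce

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def locus_matrix(mu, P):
    """Single-locus transmission matrix (1 - mu) I + mu P."""
    n = len(P)
    return [[(1.0 - mu) * (i == j) + mu * P[i][j] for j in range(n)]
            for i in range(n)]

def transmission_matrix(mus, Ps):
    """Haplotype transmission matrix: Kronecker product over loci, as in (12)."""
    return reduce(kron, [locus_matrix(mu, P) for mu, P in zip(mus, Ps)])

# Hypothetical example: locus 1 has 2 alleles, locus 2 has 3 alleles.
# Each column of each P sums to 1 (column-stochastic convention).
P1 = [[0.0, 1.0],
      [1.0, 0.0]]
P2 = [[0.0, 0.5, 0.5],
      [0.5, 0.0, 0.5],
      [0.5, 0.5, 0.0]]

M = transmission_matrix([0.01, 0.02], [P1, P2])

# The product of column-stochastic matrices under kron is column-stochastic.
column_sums = [sum(row[j] for row in M) for j in range(len(M))]
```

The haplotype space here has 2 x 3 = 6 states, and the probability of no mutation at either locus is the product of the single-locus probabilities, which is the multiplicative structure the text describes.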

2.4.1 Consequences for Modifier Polymorphisms 

Multivariate, multiplicative variation does not allow for the elegant "viability analogous, Hardy-Weinberg" (VAHW) equilibria, $z = y \otimes v$, that arise in modifier models with linear variation ($y$ is the frequency vector of the modifier alleles) (Feldman and Krakauer, 1976; Altenberg, 1984, pp. 130-169; Feldman and Liberman, 1986; Liberman and Feldman, 1986b; Liberman and Feldman, 1986a).

In VAHW equilibria, the parameter controlled by the modifier allele behaves as if it were a viability fitness coefficient (one minus that parameter, actually). The transmission probabilities have the parameterized form:

$$T(a\,i \leftarrow a\,j \mid b\,k) = T_{\alpha_{ab}}(i \leftarrow j \mid k),$$

where the modifier locus genotype $(a, b)$ enters solely through the parameter $\alpha_{ab}$. The VAHW structure requires that population averages of the transmission rates be expressible as $T_{\alpha}(i \leftarrow j \mid k)$ for some $\alpha$. This is possible if and only if the space of variation in transmission is convex.

But convexity no longer holds for multiplicative variation. For one-locus mutation with linear variation, the convexity of the space is seen by its form:

$$\mathcal{M} := \{ (1 - \mu) I + \mu P : \mu \in (0, 1/2) \},$$

where, for a set $\{ p_i : p_i > 0, \sum_i p_i = 1 \}$, one has

$$\sum_i p_i M_{\mu_i} = M_{\bar\mu}, \quad \text{where } \bar\mu := \sum_i p_i \mu_i.$$

For multivariate, multiplicative variation, the space of variation is

$$\mathcal{M} := \left\{ \bigotimes_{\xi=1}^{L} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi P^{(\xi)} \right] : \mu_\xi \in (0, 1/2) \right\}.$$

If $\mu_\xi$ varies for more than one $\xi$, $\mathcal{M}$ is no longer a convex set, so population averages of $M_{\boldsymbol\mu}$ over different $\boldsymbol\mu$ are not of the form $M_{\boldsymbol\eta}$ for any $\boldsymbol\eta$.

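This can be seen in the simplest case where $L = 2$. Let there be two sets of different mutation rates for each locus, $\boldsymbol\mu_1$ and $\boldsymbol\mu_2$, and take their weighted average using $p$, $1 - p$, $0 < p < 1$. Now, suppose there is $\bar{\boldsymbol\mu}$ such that:

$$M_{\bar{\boldsymbol\mu}} = p M_{\boldsymbol\mu_1} + (1 - p) M_{\boldsymbol\mu_2}$$
$$= p \left[ (1 - \mu_1^{(1)}) I^{(1)} + \mu_1^{(1)} P^{(1)} \right] \otimes \left[ (1 - \mu_1^{(2)}) I^{(2)} + \mu_1^{(2)} P^{(2)} \right] + (1 - p) \left[ (1 - \mu_2^{(1)}) I^{(1)} + \mu_2^{(1)} P^{(1)} \right] \otimes \left[ (1 - \mu_2^{(2)}) I^{(2)} + \mu_2^{(2)} P^{(2)} \right]$$

(here $\mu_1^{(1)}$ and $\mu_2^{(1)}$ refer to the two mutation rates at locus 1). Equating coefficients on each matrix term,

$$(1 - \bar\mu^{(1)})(1 - \bar\mu^{(2)}) = p (1 - \mu_1^{(1)})(1 - \mu_1^{(2)}) + (1 - p)(1 - \mu_2^{(1)})(1 - \mu_2^{(2)}),$$
$$(1 - \bar\mu^{(1)}) \bar\mu^{(2)} = p (1 - \mu_1^{(1)}) \mu_1^{(2)} + (1 - p)(1 - \mu_2^{(1)}) \mu_2^{(2)},$$
$$\bar\mu^{(1)} (1 - \bar\mu^{(2)}) = p \mu_1^{(1)} (1 - \mu_1^{(2)}) + (1 - p) \mu_2^{(1)} (1 - \mu_2^{(2)}), \text{ and}$$
$$\bar\mu^{(1)} \bar\mu^{(2)} = p \mu_1^{(1)} \mu_1^{(2)} + (1 - p) \mu_2^{(1)} \mu_2^{(2)}.$$

The result of adding the last two equations, and adding the second and last equations:

$$\bar\mu^{(1)} = p \mu_1^{(1)} + (1 - p) \mu_2^{(1)}, \qquad \bar\mu^{(2)} = p \mu_1^{(2)} + (1 - p) \mu_2^{(2)},$$

when substituted into the last equation, gives:

$$p (1 - p) \left( \mu_1^{(1)} - \mu_2^{(1)} \right) \left( \mu_1^{(2)} - \mu_2^{(2)} \right) = 0,$$

which requires either $\mu_1^{(1)} = \mu_2^{(1)}$, or $\mu_1^{(2)} = \mu_2^{(2)}$, which leaves only one locus with mutation rate variation, or $p = 0$, or $p = 1$, which is fixation of the modifier.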
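The two-locus argument can be checked numerically with a small sketch. It is illustrative only: the two-allele mutation distribution `P`, the rate pairs, and the weight `p` are hypothetical. Because the marginal equations force the candidate rates to be the averaged rates, it suffices to compare the averaged matrix with the family member at the averaged rates; the largest entrywise gap should equal the product term $p(1-p)(\mu_1^{(1)} - \mu_2^{(1)})(\mu_1^{(2)} - \mu_2^{(2)})$ in absolute value.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def locus_matrix(mu, P):
    """Single-locus transmission matrix (1 - mu) I + mu P."""
    n = len(P)
    return [[(1.0 - mu) * (i == j) + mu * P[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical two-allele mutation distribution, used at both loci.
P = [[0.0, 1.0],
     [1.0, 0.0]]

def two_locus(mu1, mu2):
    return kron(locus_matrix(mu1, P), locus_matrix(mu2, P))

p = 0.5
rates_a = (0.1, 0.1)   # (locus 1, locus 2) rates under one modifier allele
rates_b = (0.3, 0.4)   # rates under the other modifier allele

Ma, Mb = two_locus(*rates_a), two_locus(*rates_b)
average = [[p * x + (1 - p) * y for x, y in zip(ra, rb)]
           for ra, rb in zip(Ma, Mb)]

# The only candidate inside the family is the one at the averaged rates.
mean_rates = [p * a + (1 - p) * b for a, b in zip(rates_a, rates_b)]
candidate = two_locus(*mean_rates)

gap = max(abs(x - y)
          for ra, rc in zip(average, candidate)
          for x, y in zip(ra, rc))
predicted = p * (1 - p) * (rates_a[0] - rates_b[0]) * (rates_a[1] - rates_b[1])
```

Setting either rate difference to zero, or `p` to 0 or 1, makes `predicted` vanish, which matches the exceptional cases listed in the text.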

Thus, when the modifier locus is polymorphic and varies the mutation rates at more than one locus, averages over the modifier alleles can no longer be summarized by the averages of the mutation rate parameters, but instead yield mean transmission matrices (2) that fall outside of $\mathcal{M}$. Hence, the relation between $M(a)$ and the matrices $\{M(b)\}$ in (5) is not simple.

The analysis of modifier polymorphisms for multivariate, multiplicative variation in transmission will require techniques that can handle more general spaces of variation in transmission, a topic left for another study.

3 Mathematical Tools 



The analysis here is made possible with the techniques used in Theorem 5.1 of Karlin (1982, pp. 114-116, 197-198). The theorem is restated as follows:

Definition 1 (Symmetrizable Matrices). A square, real matrix $A$ is called symmetrizable to a symmetric real matrix $S$ if it can be represented as a product $A = LSR$, where $L$ and $R$ are positive diagonal matrices.

Theorem 1 (Karlin, 1982, Theorem 5.1, pp. 114-116, 197-198). Consider a family $\mathcal{F}$ of stochastic matrices that commute and are symmetrizable to positive definite matrices:

$$\mathcal{F} := \{ M_h = L S_h R : M_h M_k = M_k M_h \}, \tag{13}$$

where $L$ and $R$ are positive diagonal matrices, and each $S_h$ is a positive definite symmetric real matrix. Let $D$ be a positive diagonal matrix. Then, for each pair $M_h, M_k \in \mathcal{F}$:

$$\rho(M_h M_k D) \le \rho(M_k D). \tag{14}$$

Karlin's proof uses a specially crafted inner product, but here I utilize a 
canonical form for symmetrizable matrices: 

Lemma 1 (Canonical Form for Symmetrizable Matrices). A symmetrizable matrix $A = LSR$ can always be represented by a single positive diagonal matrix, $B$, and a symmetric matrix, $\hat{S}$, that has the same spectrum as $A$:

$$A = LSR = B \hat{S} B^{-1}, \tag{15}$$

where

$$B = L^{1/2} R^{-1/2} c \tag{16}$$

with $c > 0$ any scalar, and

$$\hat{S} = L^{1/2} R^{1/2} \, S \, L^{1/2} R^{1/2}. \tag{17}$$

Furthermore, the Jordan canonical form, $\hat{S} = K \Lambda K^\top$, with orthogonal matrix $K$, and real diagonal matrix $\Lambda$ of the eigenvalues of $\hat{S}$ and $A$, provides a canonical form for symmetrizable $A$:

$$A = LSR = B K \Lambda K^\top B^{-1}. \tag{18}$$

$BK$ is the matrix of right eigenvectors of $A$ (columns), and $K^\top B^{-1}$ is the matrix of left eigenvectors of $A$ (rows). $B$ can be made unique by setting $c$ to a normalizer $c = \min_i [L^{-1/2} R^{1/2}]_{ii}$, which yields $\rho(B) = 1$.

Proof. Verifying by substitution:

$$B \hat{S} B^{-1} = (L^{1/2} R^{-1/2}) \, L^{1/2} R^{1/2} \, S \, L^{1/2} R^{1/2} \, (L^{-1/2} R^{1/2}) = LSR.$$

$L^{1/2}$ and $R^{1/2}$ exist because $L$ and $R$ are positive diagonal matrices, and $R$ and $L$ (and their powers) commute because they are diagonal matrices. $\hat{S}$ is symmetric by the symmetric form of $L^{1/2} R^{1/2} \, S \, L^{1/2} R^{1/2}$, so its Jordan canonical form, $\hat{S} = K \Lambda K^\top$, has orthogonal $K$, and real diagonal $\Lambda$ (Horn and Johnson, 1985, Theorem 4.1.5, p. 171).

Since $A(BK) = B K \Lambda K^\top B^{-1} B K = (BK) \Lambda$, it can be seen that the $j$th column of $BK$ is a right eigenvector associated with eigenvalue $[\Lambda]_{jj}$. Similarly, $(K^\top B^{-1}) A = K^\top B^{-1} B K \Lambda K^\top B^{-1} = \Lambda (K^\top B^{-1})$, so the $i$th row of $K^\top B^{-1}$ is a left eigenvector with eigenvalue $[\Lambda]_{ii}$.

Setting

$$B = \min_i [L^{-1/2} R^{1/2}]_{ii} \; L^{1/2} R^{-1/2} \tag{19}$$

gives a unique $B$ normalized so that $\max_i [B]_{ii} = 1$. □

Footnote (to Theorem 1): The version in Karlin (1976, pp. 642-647) states the non-strict inequality, but Karlin (1982, Theorem 5.1, p. 116) states strict inequality, although the proof, pp. 197-198, does not exclude equality. Strict inequality holds provided all of the matrices are irreducible.
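As a numerical sanity check of Lemma 1, the following sketch rebuilds $A = LSR$ from the canonical form (15)-(17) and applies the normalizer of (19). The diagonal entries `l`, `r` and the symmetric matrix `S` are hypothetical values chosen only for illustration, and `S_hat` stands for the symmetric matrix of (17).

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def diag(values):
    """Diagonal matrix with the given entries."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical inputs: positive diagonals of L and R, and a symmetric S.
l = [2.0, 0.5, 1.5]
r = [0.4, 3.0, 1.0]
S = [[2.0, 0.3, 0.1],
     [0.3, 1.0, 0.2],
     [0.1, 0.2, 0.5]]

A = matmul(matmul(diag(l), S), diag(r))            # A = L S R

# Canonical form with c = 1:
# B = L^{1/2} R^{-1/2},  S_hat = L^{1/2} R^{1/2} S L^{1/2} R^{1/2}
b = [math.sqrt(li / ri) for li, ri in zip(l, r)]
h = [math.sqrt(li * ri) for li, ri in zip(l, r)]
S_hat = matmul(matmul(diag(h), S), diag(h))
A_rebuilt = matmul(matmul(diag(b), S_hat), diag([1.0 / x for x in b]))

n = len(l)
max_error = max(abs(A[i][j] - A_rebuilt[i][j]) for i in range(n) for j in range(n))
asymmetry = max(abs(S_hat[i][j] - S_hat[j][i]) for i in range(n) for j in range(n))

# Normalizer from (19): c = min_i [L^{-1/2} R^{1/2}]_ii, so that max_i B_ii = 1.
c = min(1.0 / x for x in b)
b_normalized = [c * x for x in b]
```

Rescaling `b` by `c` leaves the product unchanged, since the scalar cancels between $B$ and $B^{-1}$; it only fixes which representative of $B$ is used.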

The symmetrizable stochastic matrices considered here have the same canonical form as the transition matrices of reversible Markov chains (Keilson, 1979, p. 33; Ababneh et al., 2006, p. 296). One may ask whether they are one and the same. Indeed they are. An ergodic Markov chain is reversible if its transition matrix $M$ is irreducible and obeys:

$$M D_\pi = (M D_\pi)^\top, \tag{20}$$

where $M \pi = \pi$, with $\pi$ the stationary distribution of the chain and the Perron vector of $M$, which refers to the eigenvector associated with the eigenvalue of largest modulus, the Perron root 1. $D_\pi$ is the diagonal matrix of the entries of $\pi$ (Feller, 1971, pp. 414-415; Iosifescu, 1980, pp. 143-145). Hence this follows:



Lemma 2 (Reversible Markov Chains). An irreducible stochastic matrix is of the form $M = LSR$, with $L$ and $R$ positive diagonal matrices and $S$ a symmetric matrix, if and only if it is the transition matrix of a reversible ergodic Markov chain.

Proof. For the 'if' part, since $M$ is the transition matrix of an ergodic Markov chain, $M$ must be an irreducible stochastic matrix (column stochastic by convention in this paper). It therefore has a strictly positive Perron vector, $\pi > 0$, for Perron root 1.

Let $B := D_\pi^{1/2}$, and $\hat{S} := B^{-1} M B$. First $\hat{S}$ will be shown to be symmetric. Since $M$ satisfies (20) by hypothesis:

$$M D_\pi = M B^2 = (M B^2)^\top = B^2 M^\top.$$

Using $M = B \hat{S} B^{-1}$,

$$M D_\pi = M B^2 = B \hat{S} B^{-1} B^2 = B \hat{S} B,$$

$$B^2 M^\top = B^2 (B \hat{S} B^{-1})^\top = B^2 B^{-1} \hat{S}^\top B = B \hat{S}^\top B.$$

So $B \hat{S} B = B \hat{S}^\top B$, hence $\hat{S} = \hat{S}^\top$.

Let $R = B^{-2} L$ for any positive diagonal matrix $L$, and let symmetric matrix

$$S = L^{-1/2} R^{-1/2} \, \hat{S} \, L^{-1/2} R^{-1/2}.$$

This produces the desired $M = LSR$.

For the 'only if' part, given that $M = LSR$, use $M = B \hat{S} B^{-1}$ from Lemma 1, where $B = L^{1/2} R^{-1/2}$ and $\hat{S} = L^{1/2} R^{1/2} \, S \, L^{1/2} R^{1/2}$. Substituting, with $e$ the vector of ones:

$$e^\top M = e^\top B \hat{S} B^{-1} = e^\top \implies e^\top B \hat{S} = e^\top B \implies \hat{S}^\top B e = \hat{S} B e = B e,$$

hence $Be$ is a Perron vector of $\hat{S}$.

Let $\pi$ be the right eigenvector of $M$ associated with the Perron root, normalized so that $e^\top \pi = 1$. Then:

$$M \pi = B \hat{S} B^{-1} \pi = \pi \implies \hat{S} B^{-1} \pi = B^{-1} \pi,$$

hence $B^{-1} \pi$ is also a Perron vector of $\hat{S}$. Since $\hat{S}$ is irreducible, the Perron vector of $\hat{S}$ is unique (up to scaling, $c$), therefore

$$c \, B e = B^{-1} \pi \implies \pi = c \, B^2 e = \frac{1}{e^\top B^2 e} B^2 e = \frac{1}{e^\top L R^{-1} e} L R^{-1} e.$$

Note that $D_\pi = B^2 (e^\top L R^{-1} e)^{-1}$. Substituting:

$$M D_\pi = (B \hat{S} B^{-1}) B^2 (e^\top L R^{-1} e)^{-1} = B \hat{S} B \, (e^\top L R^{-1} e)^{-1},$$

which is symmetric. Therefore, irreducible $M = LSR$ satisfies the condition for the transition matrix of a reversible Markov chain. □
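The equivalence in Lemma 2 can be illustrated with a short sketch. One standard way to obtain a reversible chain is to normalize the columns of a symmetric non-negative weight matrix. That construction, the weights `W`, and the variable names are assumptions made for this example only. The sketch checks stationarity, the detailed balance condition (20), and the symmetry of $B^{-1} M B$ with $B = D_\pi^{1/2}$.

```python
import math

# Hypothetical symmetric, non-negative weights defining a reversible chain.
W = [[1.0, 2.0, 0.5],
     [2.0, 0.0, 1.0],
     [0.5, 1.0, 3.0]]
n = len(W)

col = [sum(W[i][j] for i in range(n)) for j in range(n)]       # column sums
M = [[W[i][j] / col[j] for j in range(n)] for i in range(n)]   # column-stochastic
total = sum(col)
pi = [c / total for c in col]                                  # stationary distribution

# Stationarity: M pi = pi.
M_pi = [sum(M[i][j] * pi[j] for j in range(n)) for i in range(n)]
stationarity_error = max(abs(M_pi[i] - pi[i]) for i in range(n))

# Detailed balance (20): M D_pi is symmetric.
flux = [[M[i][j] * pi[j] for j in range(n)] for i in range(n)]
balance_error = max(abs(flux[i][j] - flux[j][i]) for i in range(n) for j in range(n))

# Symmetrization: S_hat = B^{-1} M B with B = D_pi^{1/2}.
S_hat = [[M[i][j] * math.sqrt(pi[j]) / math.sqrt(pi[i]) for j in range(n)]
         for i in range(n)]
symmetry_error = max(abs(S_hat[i][j] - S_hat[j][i]) for i in range(n) for j in range(n))
```

Perturbing a single off-diagonal entry of `M` (while keeping columns summing to one) generally breaks both the detailed balance check and the symmetry of `S_hat` together, in line with the "if and only if" of the lemma.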



4 Results 

With these mathematical tools in place, we are ready to analyze the modifier models. The core result is the following theorem that the derivative of the spectral radius of the stability matrix $M_{\boldsymbol\mu} D$ with respect to each mutation rate parameter is negative.

Theorem 2 (Multivariate, Multiplicative Variation). Consider the stochastic matrix

$$M_{\boldsymbol\mu} = \bigotimes_{\xi=1}^{L} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi P^{(\xi)} \right], \tag{21}$$

where each $P^{(\xi)}$ is a $\nu_\xi \times \nu_\xi$ transition matrix for a reversible ergodic Markov chain.

Let $D$ be a positive diagonal matrix. Then for every point $\boldsymbol\mu \in (0, 1/2)^L$, the spectral radius of

$$M_{\boldsymbol\mu} D = \left\{ \bigotimes_{\xi=1}^{L} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi P^{(\xi)} \right] \right\} D$$

is non-increasing in each $\mu_\xi$.

If diagonal entries

$$D_{i_1 \cdots i_\kappa \cdots i_L} \ne D_{i_1 \cdots i'_\kappa \cdots i_L}$$

differ for at least one pair $i_\kappa, i'_\kappa \in \{1, \ldots, \nu_\kappa\}$, for some $i_1 \in \{1, \ldots, \nu_1\}$, $\ldots$, $i_{\kappa-1} \in \{1, \ldots, \nu_{\kappa-1}\}$, $i_{\kappa+1} \in \{1, \ldots, \nu_{\kappa+1}\}$, $\ldots$, $i_L \in \{1, \ldots, \nu_L\}$, then

$$\frac{\partial \rho(M_{\boldsymbol\mu} D)}{\partial \mu_\kappa} < 0.$$

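Proof. The proof is presented in three sections: applying the canonical form, evaluating the derivative, and evaluating the equality case.

Applying the Canonical Form:

The first step is to utilize the canonical form (18). Since each $P^{(\xi)}$ is the transition matrix of an ergodic Markov chain, it is irreducible and thus has Perron vector $\pi^{(\xi)} > 0$, hence Lemma 2 and Lemma 1 apply. Therefore $P^{(\xi)}$ has the canonical form

$$P^{(\xi)} = B^{(\xi)} K^{(\xi)} \Lambda^{(\xi)} K^{(\xi)\top} B^{(\xi)-1}, \tag{22}$$

where

$B^{(\xi)}$ is a positive diagonal matrix,

$K^{(\xi)}$ is orthogonal, i.e. $K^{(\xi)} K^{(\xi)\top} = K^{(\xi)\top} K^{(\xi)} = I^{(\xi)}$, and

$\Lambda^{(\xi)}$ is a diagonal matrix of the eigenvalues of $P^{(\xi)}$, with largest simple eigenvalue 1.

Define

$$\Upsilon^{(\xi)}_{\mu_\xi} := (1 - \mu_\xi) I^{(\xi)} + \mu_\xi \Lambda^{(\xi)}. \tag{23}$$

For $\mu_\xi \in (0, 1/2)$, the diagonal entries of $\Upsilon^{(\xi)}_{\mu_\xi}$ are all positive. This is seen as follows: $\Lambda^{(\xi)}$ is the diagonal matrix of the eigenvalues of $P^{(\xi)}$, which are all real due to symmetrizability. Because $P^{(\xi)}$ is an irreducible stochastic matrix, by Perron-Frobenius theory it has simple largest eigenvalue 1, and all other eigenvalues of modulus at most 1. Without loss of generality, arrange the indices so that the spectral radius corresponds to index 1. Hence:

$$[\Lambda^{(\xi)}]_{11} = 1, \text{ and } [\Lambda^{(\xi)}]_{ii} \in [-1, 1) \text{ for all } i \ne 1. \tag{24}$$

Therefore,

$$[\Upsilon^{(\xi)}_{\mu_\xi}]_{11} = (1 - \mu_\xi) + \mu_\xi = 1,$$

and for $i \ne 1$:

$$[\Upsilon^{(\xi)}_{\mu_\xi}]_{ii} = (1 - \mu_\xi) + \mu_\xi [\Lambda^{(\xi)}]_{ii} \in [1 - 2\mu_\xi, 1) \subset (0, 1). \tag{25}$$

Substituting (22) and (23) into (21), one gets:

$$M_{\boldsymbol\mu} = \bigotimes_{\xi=1}^{L} \left( (1 - \mu_\xi) I^{(\xi)} + \mu_\xi B^{(\xi)} K^{(\xi)} \Lambda^{(\xi)} K^{(\xi)\top} B^{(\xi)-1} \right)$$
$$= \bigotimes_{\xi=1}^{L} \left( B^{(\xi)} K^{(\xi)} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi \Lambda^{(\xi)} \right] K^{(\xi)\top} B^{(\xi)-1} \right)$$
$$= \bigotimes_{\xi=1}^{L} \left( B^{(\xi)} K^{(\xi)} \Upsilon^{(\xi)}_{\mu_\xi} K^{(\xi)\top} B^{(\xi)-1} \right)$$
$$= \Big( \bigotimes_{\xi=1}^{L} B^{(\xi)} \Big) \Big( \bigotimes_{\xi=1}^{L} K^{(\xi)} \Big) \Big( \bigotimes_{\xi=1}^{L} \Upsilon^{(\xi)}_{\mu_\xi} \Big) \Big( \bigotimes_{\xi=1}^{L} K^{(\xi)} \Big)^{\!\top} \Big( \bigotimes_{\xi=1}^{L} B^{(\xi)-1} \Big)$$
$$= B K \Upsilon_{\boldsymbol\mu} K^\top B^{-1}, \tag{26}$$

where

$$B := \bigotimes_{\xi=1}^{L} B^{(\xi)}, \qquad K := \bigotimes_{\xi=1}^{L} K^{(\xi)},$$

and

$$\Upsilon_{\boldsymbol\mu} := \bigotimes_{\xi=1}^{L} \Upsilon^{(\xi)}_{\mu_\xi} = \bigotimes_{\xi=1}^{L} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi \Lambda^{(\xi)} \right]. \tag{27}$$

Since $B$, $K$, and $\Upsilon_{\boldsymbol\mu}$ are all invertible, they can be rotated in sequence without altering the spectrum (i.e. $\rho(A_1 A_2 A_3) = \rho(A_1^{-1} A_1 A_2 A_3 A_1) = \rho(A_2 A_3 A_1)$). The key step from Karlin (1982, Proof of Theorem 5.1, pp. 197-198) is to rotate the terms into a symmetric form:

$$\rho(M_{\boldsymbol\mu} D) = \rho(B K \Upsilon_{\boldsymbol\mu} K^\top B^{-1} D) = \rho(K \Upsilon_{\boldsymbol\mu} K^\top B^{-1} D B)$$
$$= \rho(\Upsilon_{\boldsymbol\mu} K^\top B^{-1} D B K) = \rho(\Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{1/2}), \tag{28}$$

since $B^{-1} D B = D$.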

To this symmetric form one can apply the Rayleigh-Ritz variational characterization of the spectral radius (Karlin, 1982, p. 198; Wilkinson, 1965, pp. 172-173; Horn and Johnson, 1985, pp. 176-180). The Rayleigh-Ritz formula is, for any symmetric real matrix $A$ whose largest eigenvalue equals its spectral radius (as holds for the positive definite matrix in (28)):

$$\rho(A) = \sup_{x \ne 0} \frac{x^\top A x}{x^\top x}. \tag{29}$$

Footnote (on the cancellation $B^{-1} D B = D$ in (28)): The cancellation of the diagonal matrix $B$ is the step that is blocked when the modifier recombines with the loci under selection, giving $r > 0$, in which case $D$ is replaced by $[(1 - r) D + r D_{z_b} W / \bar{w}]$ (assuming the recombinant and non-recombinant transmission matrices are equal), and so instead of the symmetric term $B^{-1} D B = D$, one has $B^{-1} [(1 - r) D + (r / \bar{w}) D_{z_b} W] B = (1 - r) D + r B^{-1} D_{z_b} W B / \bar{w}$, which is generically not symmetric, thus precluding use of the Rayleigh quotient at this step.

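Let $\hat{x}(\boldsymbol\mu)$ be a vector, constrained to the unit sphere, $\hat{x}(\boldsymbol\mu)^\top \hat{x}(\boldsymbol\mu) = 1$, that maximizes

$$\phi(x) := x^\top (\Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{1/2}) x.$$

Then $\hat{x}(\boldsymbol\mu)$ is an eigenvector satisfying:

$$(\Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{1/2}) \, \hat{x}(\boldsymbol\mu) = \rho(M_{\boldsymbol\mu} D) \, \hat{x}(\boldsymbol\mu). \tag{30}$$

Pre-multiplying each side of (30) by $B K \Upsilon_{\boldsymbol\mu}^{1/2}$:

$$\rho(M_{\boldsymbol\mu} D) \, B K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu) = B K \Upsilon_{\boldsymbol\mu} K^\top B^{-1} D \, (B K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu)) = M_{\boldsymbol\mu} D \, (B K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu)),$$

therefore $B K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu)$ is the eigenvector of $M_{\boldsymbol\mu} D$ associated with the spectral radius, unique since $M_{\boldsymbol\mu} D$ is irreducible, so call it

$$v(\boldsymbol\mu) := B K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu). \tag{31}$$

Since $B$, $K$, and $\Upsilon_{\boldsymbol\mu}^{1/2}$ are all invertible,

$$\hat{x}(\boldsymbol\mu) = \Upsilon_{\boldsymbol\mu}^{-1/2} K^\top B^{-1} v(\boldsymbol\mu). \tag{32}$$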
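Evaluating the Derivative:

Differentiating (28) with respect to the mutation rate $\mu_\kappa$ at the $\kappa$th locus under selection:

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu} D) = 2 \hat{x}(\boldsymbol\mu)^\top (\Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{1/2}) \frac{\partial \hat{x}(\boldsymbol\mu)}{\partial \mu_\kappa} + 2 \hat{x}(\boldsymbol\mu)^\top \Big( \Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K \frac{\partial \Upsilon_{\boldsymbol\mu}^{1/2}}{\partial \mu_\kappa} \Big) \hat{x}(\boldsymbol\mu).$$

As in Karlin's proof of Theorem 5.2 (1982, p. 195), since $\hat{x}(\boldsymbol\mu)$ maximizes the quadratic function $\phi(x)$, it is a critical point of $\phi(x)$ (Duistermaat and Kolk, 2004, p. 72), therefore $\partial \phi(x) / \partial x |_{\hat{x}(\boldsymbol\mu)} = 0$, so

$$\frac{\partial \phi(x)}{\partial x} \Big|_{\hat{x}(\boldsymbol\mu)} \frac{\partial \hat{x}(\boldsymbol\mu)}{\partial \mu_\kappa} = 2 \hat{x}(\boldsymbol\mu)^\top (\Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{1/2}) \frac{\partial \hat{x}(\boldsymbol\mu)}{\partial \mu_\kappa} = 0.$$

Using

$$\frac{\partial \Upsilon_{\boldsymbol\mu}^{1/2}}{\partial \mu_\kappa} = \frac{1}{2} \Upsilon_{\boldsymbol\mu}^{-1/2} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa},$$

this leaves:

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu} D) = \hat{x}(\boldsymbol\mu)^\top \Big( \Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{-1/2} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} \Big) \hat{x}(\boldsymbol\mu), \tag{33}$$

where $\partial \Upsilon_{\boldsymbol\mu} / \partial \mu_\kappa$ evaluates to:

$$\frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} = \frac{\partial}{\partial \mu_\kappa} \bigotimes_{\xi=1}^{L} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi \Lambda^{(\xi)} \right]$$
$$= \Big( \bigotimes_{\xi=1}^{\kappa-1} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi \Lambda^{(\xi)} \right] \Big) \otimes (\Lambda^{(\kappa)} - I^{(\kappa)}) \otimes \Big( \bigotimes_{\xi=\kappa+1}^{L} \left[ (1 - \mu_\xi) I^{(\xi)} + \mu_\xi \Lambda^{(\xi)} \right] \Big), \tag{34}$$

and

$$\Upsilon_{\boldsymbol\mu}^{-1} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} = \Big[ \bigotimes_{\xi=1}^{\kappa-1} I^{(\xi)} \Big] \otimes \operatorname{diag} \left[ \frac{[\Lambda^{(\kappa)}]_{ii} - 1}{(1 - \mu_\kappa) + \mu_\kappa [\Lambda^{(\kappa)}]_{ii}} \right]_{i=1}^{\nu_\kappa} \otimes \Big[ \bigotimes_{\xi=\kappa+1}^{L} I^{(\xi)} \Big]. \tag{35}$$

Using (30) one can substitute $\hat{x}(\boldsymbol\mu)^\top (\Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K) = \rho(M_{\boldsymbol\mu} D) \, \hat{x}(\boldsymbol\mu)^\top \Upsilon_{\boldsymbol\mu}^{-1/2}$ into (33) and obtain:

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu} D) = \rho(M_{\boldsymbol\mu} D) \, \hat{x}(\boldsymbol\mu)^\top \Upsilon_{\boldsymbol\mu}^{-1/2} \Upsilon_{\boldsymbol\mu}^{-1/2} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} \hat{x}(\boldsymbol\mu)$$
$$= \rho(M_{\boldsymbol\mu} D) \, \hat{x}(\boldsymbol\mu)^\top \Upsilon_{\boldsymbol\mu}^{-1} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} \hat{x}(\boldsymbol\mu). \tag{36}$$

From (25), one sees that all the terms in (34) are positive except for the term $\Lambda^{(\kappa)} - I^{(\kappa)}$. Here, $\Lambda^{(\kappa)} - I^{(\kappa)}$ is a diagonal matrix with $[\Lambda^{(\kappa)}]_{11} - 1 = 0$, and for $i \ne 1$, negative diagonal entries, $[\Lambda^{(\kappa)}]_{ii} - 1 < 0$.

Thus for $\boldsymbol\mu \in (0, 1/2)^L$, $\partial \Upsilon_{\boldsymbol\mu} / \partial \mu_\kappa$ and $\Upsilon_{\boldsymbol\mu}^{-1} \, \partial \Upsilon_{\boldsymbol\mu} / \partial \mu_\kappa$ are negative semi-definite:

$$\frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} \preceq 0 \quad \text{and} \quad \Upsilon_{\boldsymbol\mu}^{-1} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} \preceq 0. \tag{37}$$

Therefore (36) evaluates to:

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu} D) \le 0. \tag{38}$$

The restriction of the mutation rates to the interval $(0, 1/2)$ is justified empirically, but their motivation here is analytic. The open interval on the 0 side is done to avoid the technical details of derivatives on a boundary. The open interval on the 1/2 side is more than technical: if a mutation rate is allowed to be 1/2 or greater, then the terms $(1 - \mu_\xi) + \mu_\xi [\Lambda^{(\xi)}]_{ii}$ may be 0 or negative, invalidating (37).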



Evaluating the Equality Case: 

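The conditions that allow equality in (38) are elucidated, and the work consists mostly of tracking the zeros through the equations. Representing (36) in terms of individual entries:

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu} D) = \rho(M_{\boldsymbol\mu} D) \sum_{i_1=1}^{\nu_1} \cdots \sum_{i_L=1}^{\nu_L} \hat{x}_{i_1 \cdots i_L}^2 \, \frac{[\Lambda^{(\kappa)}]_{i_\kappa i_\kappa} - 1}{(1 - \mu_\kappa) + \mu_\kappa [\Lambda^{(\kappa)}]_{i_\kappa i_\kappa}} \le 0. \tag{39}$$

In order for $\partial \rho(M_{\boldsymbol\mu} D) / \partial \mu_\kappa = 0$, every index $i$ where $[\Upsilon_{\boldsymbol\mu}^{-1} \, \partial \Upsilon_{\boldsymbol\mu} / \partial \mu_\kappa]_{ii}$ is non-zero must have $\hat{x}_i(\boldsymbol\mu) = 0$. So, either $[\Lambda^{(\kappa)}]_{i_\kappa i_\kappa} - 1 = 0$ or $\hat{x}_{i_1 \cdots i_\kappa \cdots i_L} = 0$. But $[\Lambda^{(\kappa)}]_{i_\kappa i_\kappa} = 1$ only for $i_\kappa = 1$, hence $\partial \rho(M_{\boldsymbol\mu} D) / \partial \mu_\kappa = 0$ if and only if

$$\hat{x}_{i_1 i_2 \cdots i_\kappa \cdots i_L} = 0 \text{ for all } i_\kappa \ne 1 \text{ and all } i_1, \ldots, i_{\kappa-1}, i_{\kappa+1}, \ldots, i_L. \tag{40}$$

This condition on $\hat{x}(\boldsymbol\mu)$ can be translated into a condition on $D$. Using (30):

$$\rho(M_{\boldsymbol\mu} D) \, \hat{x}(\boldsymbol\mu) = \Upsilon_{\boldsymbol\mu}^{1/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu) = \Upsilon_{\boldsymbol\mu}^{1/2} K^\top B^{-1} D B K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu).$$

Pre-multiplying each side by $[\Upsilon_{\boldsymbol\mu}^{1/2} K^\top B^{-1}]^{-1} = B K \Upsilon_{\boldsymbol\mu}^{-1/2}$:

$$\rho(M_{\boldsymbol\mu} D) \, B K \Upsilon_{\boldsymbol\mu}^{-1/2} \hat{x}(\boldsymbol\mu) = D \, (B K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu)). \tag{41}$$

Here, it becomes notationally helpful to let $\kappa$ be either the first or last index. Since there is no actual spatial structure or consequence to the ordering of the loci in the absence of recombination, the following derivation applies to any locus $\kappa$. Using $\kappa = L$, the $\hat{x}(\boldsymbol\mu)$ satisfying (40) can be written as:

$$\hat{x}(\boldsymbol\mu) = y \otimes [1 \; 0 \; \cdots \; 0]^\top \tag{42}$$

for some vector $y$ (were $\kappa$ in the middle, trying to write it as $\hat{x} = y \otimes [1 \; 0 \; \cdots \; 0]^\top \otimes y'$ forces a Kronecker factoring of $\hat{x}$ into $y$ and $y'$, which is not implied by (40)).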






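Substitution of (42) into (31) gives:

$$v(\boldsymbol\mu) = B K \Upsilon_{\boldsymbol\mu}^{1/2} \hat{x}(\boldsymbol\mu) = \Big( \bigotimes_{\xi=1}^{L} B^{(\xi)} K^{(\xi)} \Upsilon^{(\xi)1/2}_{\mu_\xi} \Big) \big( y \otimes [1 \; 0 \; \cdots \; 0]^\top \big)$$
$$= \Big[ \Big( \bigotimes_{\xi=1}^{L-1} B^{(\xi)} K^{(\xi)} \Upsilon^{(\xi)1/2}_{\mu_\xi} \Big) y \Big] \otimes B^{(L)} [K^{(L)}]_1, \tag{43}$$

where $[K^{(L)}]_1$ is the first column of $K^{(L)}$, since

$$B^{(L)} K^{(L)} \Upsilon^{(L)1/2}_{\mu_L} [1 \; 0 \; \cdots \; 0]^\top = B^{(L)} K^{(L)} [1 \; 0 \; \cdots \; 0]^\top = B^{(L)} [K^{(L)}]_1. \tag{44}$$

By construction, $B^{(L)} [K^{(L)}]_1$ is the Perron vector of irreducible $P^{(L)}$, hence $B^{(L)} [K^{(L)}]_1 > 0$. Substituting (42) and (43) into (41) gives:

$$D \Big\{ \Big[ \Big( \bigotimes_{\xi=1}^{L-1} B^{(\xi)} K^{(\xi)} \Upsilon^{(\xi)1/2}_{\mu_\xi} \Big) y \Big] \otimes B^{(L)} [K^{(L)}]_1 \Big\} = \rho(M_{\boldsymbol\mu} D) \Big[ \Big( \bigotimes_{\xi=1}^{L-1} B^{(\xi)} K^{(\xi)} \Upsilon^{(\xi)-1/2}_{\mu_\xi} \Big) y \Big] \otimes B^{(L)} [K^{(L)}]_1. \tag{45}$$

Equation (45) becomes clearer if the terms are represented by single symbols. Let

$$f := \Big( \bigotimes_{\xi=1}^{L-1} B^{(\xi)} K^{(\xi)} \Upsilon^{(\xi)1/2}_{\mu_\xi} \Big) y,$$
$$g := \Big( \bigotimes_{\xi=1}^{L-1} B^{(\xi)} K^{(\xi)} \Upsilon^{(\xi)-1/2}_{\mu_\xi} \Big) y,$$
$$k := B^{(L)} [K^{(L)}]_1, \text{ and}$$
$$\rho := \rho(M_{\boldsymbol\mu} D).$$

Then

$$v = f \otimes k,$$

and (45) becomes:

$$D (f \otimes k) = \rho \, g \otimes k. \tag{46}$$

Now (46) may be expressed in terms of the entries: Let $i_1$ index the haplotypes of all loci except $L$, and $i_2$ index the alleles of locus $L$. Then (46) is represented as:

$$D_{i_1 i_2} f_{i_1} k_{i_2} = \rho \, g_{i_1} k_{i_2} \tag{47}$$

for each $i_1, i_2$. Since $k_{i_2} > 0$, this implies $D_{i_1 i_2} f_{i_1} = \rho \, g_{i_1}$ for all $i_2$. Because $M_{\boldsymbol\mu} D$ is irreducible,

$$v > 0, \tag{48}$$

therefore $f_{i_1} > 0$. Then $D_{i_1 i_2} = \rho \, g_{i_1} / f_{i_1}$ for all $i_2$. Thus the chain of implications that starts with $\partial \rho(M_{\boldsymbol\mu} D) / \partial \mu_L = 0$ concludes with the finding that $D$ must be of the form

$$D = \tilde{D} \otimes I^{(L)},$$

where $\tilde{D} = \operatorname{diag}[\tilde{D}_{i_1}]$ with $\tilde{D}_{i_1} = \rho \, g_{i_1} / f_{i_1}$.

To summarize the equality case, recall that the above derivation applies with respect to any locus $\kappa$. Thus, if and only if

$$D_{i_1 \cdots i_\kappa \cdots i_L} \ne D_{i_1 \cdots i'_\kappa \cdots i_L}, \tag{49}$$

for at least one pair $i_\kappa, i'_\kappa \in \{1, \ldots, \nu_\kappa\}$, for some $i_1 \in \{1, \ldots, \nu_1\}$, $\ldots$, $i_{\kappa-1} \in \{1, \ldots, \nu_{\kappa-1}\}$, $i_{\kappa+1} \in \{1, \ldots, \nu_{\kappa+1}\}$, $\ldots$, $i_L \in \{1, \ldots, \nu_L\}$, then

$$\frac{\partial \rho(M_{\boldsymbol\mu} D)}{\partial \mu_\kappa} < 0. \qquad \square$$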
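Theorem 2 can be probed numerically with a small sketch. This is an illustration under stated assumptions, not the method of the proof: the two-locus, two-allele setup, the matrices `P1` and `P2`, and the diagonal `d` are hypothetical, and the spectral radius is estimated by plain power iteration, which is adequate here because every matrix used is entrywise positive. Any irreducible two-state chain satisfies detailed balance, so both `P1` and `P2` are reversible.

```python
from functools import reduce

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def locus_matrix(mu, P):
    """Single-locus transmission matrix (1 - mu) I + mu P."""
    n = len(P)
    return [[(1.0 - mu) * (i == j) + mu * P[i][j] for j in range(n)]
            for i in range(n)]

def spectral_radius(A, iterations=5000):
    """Perron root of an entrywise positive matrix by power iteration."""
    n = len(A)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)              # v sums to 1, so sum(w) estimates the root
        v = [x / lam for x in w]
    return lam

def rho(mus, Ps, d):
    """Spectral radius of M_mu D, with D = diag(d)."""
    M = reduce(kron, [locus_matrix(mu, P) for mu, P in zip(mus, Ps)])
    n = len(d)
    MD = [[M[i][j] * d[j] for j in range(n)] for i in range(n)]
    return spectral_radius(MD)

# Hypothetical column-stochastic mutation distributions for two loci.
P1 = [[0.2, 0.6],
      [0.8, 0.4]]
P2 = [[0.0, 1.0],
      [1.0, 0.0]]
# Hypothetical positive diagonal of D, ordered (i1, i2) with i2 varying fastest.
d = [1.0, 0.9, 0.8, 0.5]

grid = [0.05, 0.15, 0.25, 0.35, 0.45]
rho_vs_mu1 = [rho([mu1, 0.1], [P1, P2], d) for mu1 in grid]
rho_vs_mu2 = [rho([0.1, mu2], [P1, P2], d) for mu2 in grid]

# Equality case: when D = c I, the spectral radius does not depend on mu.
rho_neutral = [rho([mu1, 0.1], [P1, P2], [0.7] * 4) for mu1 in grid]
```

Because `d` differs across the alleles of both loci, both sequences should decrease along the grid, while the `rho_neutral` values should all equal the common diagonal entry.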



Remarks. In the case where $[D]_{ii} = 0$ for some $i$, $M_{\boldsymbol\mu} D$ is no longer irreducible, and so a unique positive Perron vector $v$ for $M_{\boldsymbol\mu} D$ is no longer guaranteed. If the set of $[D]_{ii} = 0$ entries dissects the haplotype space into multiple non-communicating sub-spaces, each of these is represented by an isolated block in the Frobenius normal form of $M_{\boldsymbol\mu} D$, thus $M_{\boldsymbol\mu} D$ will have multiple non-negative eigenvectors. This situation is more complicated than merits pursuit here. However, when the $[D]_{ii} = 0$ entries do not destroy the uniqueness of $v$, it yields a ready result:

Corollary 1 (Lethality Case). Let all the conditions of Theorem 2 apply except that $[D]_{ii} = 0$ for at least one $i$. If $M_{\boldsymbol\mu} D$ has a unique eigenvector $v(\boldsymbol\mu)$ associated with eigenvalue $\rho(M_{\boldsymbol\mu} D)$, then

$$\frac{\partial \rho(M_{\boldsymbol\mu} D)}{\partial \mu_\kappa} = 0$$

if and only if

$$D_{i_1 \cdots i_\kappa \cdots i_L} = D_{i_1 \cdots i'_\kappa \cdots i_L}$$

for all $i_\kappa, i'_\kappa \in \{1, \ldots, \nu_\kappa\}$, whenever

$$v_{i_1 \cdots i_\kappa \cdots i_L} > 0, \text{ and } v_{i_1 \cdots i'_\kappa \cdots i_L} > 0.$$

Otherwise,

$$\frac{\partial \rho(M_{\boldsymbol\mu} D)}{\partial \mu_\kappa} < 0.$$

Proof. The positivity of $D$ enters the proof of Theorem 2 only at step (31). Assuming the uniqueness of $v$ allows one to preserve (31). The first consequence of relaxing the positivity condition does not occur until after (48), when it can no longer be assumed that $f_{i_1} > 0$. Continuing with the notation introduced at (46), $D_{i_1 i_2} f_{i_1} k_{i_2} = \rho \, g_{i_1} k_{i_2}$ is satisfied as long as $D_{i_1 i_2} = \rho \, g_{i_1} / f_{i_1}$ for all $i_2$ whenever $f_{i_1} > 0$, that is, whenever $v_{i_1 i_2} > 0$, which is what is stated in the corollary using the multilocus notation.

There are further implications for $D$ (these do not alter the statement of the corollary): $D_{i_1 i_2} f_{i_1} = \rho \, g_{i_1}$, so if some $D_{i_1 i_2} = 0$, then $g_{i_1} = 0$. But then $D_{i_1 i'_2} f_{i_1} = 0$ for every $i'_2$. Consequently, either $f_{i_1} = 0$, which means $v_{i_1 i_2} = 0$ for every $i_2$, or $D_{i_1 i'_2} = 0$ for all $i'_2$. Thus, under the condition that $\partial \rho(M_{\boldsymbol\mu} D) / \partial \mu_L = 0$, the existence of one lethal haplotype $i_1 i_2$ implies either that all haplotypes with $i_1$ are lethal, or that all haplotypes with $i_1$ are absent from the population.

In the latter case, $f_{i_1} = 0$, there are further implications. Recall that $D (f \otimes k) = \rho \, g \otimes k$, and $\rho \, v = \rho \, f \otimes k = M_{\boldsymbol\mu} D v = M_{\boldsymbol\mu} D (f \otimes k) = \rho \, M_{\boldsymbol\mu} (g \otimes k)$. So $f_{i_1} = 0$ if and only if $v_{i_1 i_2} = f_{i_1} k_{i_2} = 0$, since $k_{i_2} > 0$. Thus

$$0 = v_{i_1 i_2} = \sum_{j_1, j_2} [M_{\boldsymbol\mu}]_{i_1 i_2, j_1 j_2} \, g_{j_1} k_{j_2}.$$

The zero sum mandates that $g_{j_1} = 0$ for every $j_1$ in which $[M_{\boldsymbol\mu}]_{i_1 i_2, j_1 j_2} > 0$ for some $i_2, j_2$. By (46), $g_{j_1} = 0$ implies $D_{j_1 i_2} f_{j_1} = 0$ for all $i_2$, requiring that either $f_{j_1} = 0$, or $D_{j_1 i_2} = 0$ for all $i_2$.

In the case where $f_{j_1} = 0$, then the above argument applies in turn to it. So consider the entire set $Z = \{ i_1 : f_{i_1} = 0 \}$.

If $[M_{\boldsymbol\mu}]_{i_1 i_2, j_1 j_2} > 0$ only when both $i_1, j_1 \in Z$, that means $[M_{\boldsymbol\mu}]_{i_1 i_2, j_1 j_2} = 0$ for all $j_1 \notin Z$. But that means there is no mutation to $Z$ from outside of $Z$, which makes $M_{\boldsymbol\mu}$ reducible, contrary to hypothesis. Therefore there must be some $[M_{\boldsymbol\mu}]_{i_1 i_2, j_1 j_2} > 0$ that has $i_1 \in Z$ and $j_1 \notin Z$. And $j_1 \notin Z$ implies $D_{j_1 i_2} = 0$ for all $i_2$.

Therefore, if $\partial \rho(M_{\boldsymbol\mu} D) / \partial \mu_L = 0$, the existence of one lethal haplotype $i_1 i_2$ implies either that all haplotypes with $i_1$ are lethal, or that all haplotypes with $i_1$ are absent from the population, which implies further that all $j_1$ in the population that can mutate to $i_1$ are lethal for all haplotypes $j_1 j_2$. □

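Corollary 2 (Multiple Cell Divisions). One may substitute $M_{\boldsymbol\mu}^t$ for $M_{\boldsymbol\mu}$ in Theorem 2, where $t$ is a positive integer, and the theorem applies otherwise unchanged. The partial derivatives, however, are all scaled by $t$.

Proof. The proof is identical to that of Theorem 2 except that one is seeking

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu}^t D).$$

Following the same sequence of steps as for Theorem 2, let $\hat{x}(\boldsymbol\mu)$ confined to the unit sphere produce the maximum of

$$\phi(x) := x^\top (\Upsilon_{\boldsymbol\mu}^{t/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{t/2}) x.$$

Then $\hat{x}(\boldsymbol\mu)$ is an eigenvector satisfying:

$$(\Upsilon_{\boldsymbol\mu}^{t/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{t/2}) \, \hat{x}(\boldsymbol\mu) = \rho(M_{\boldsymbol\mu}^t D) \, \hat{x}(\boldsymbol\mu). \tag{50}$$

Following the same steps of differentiation:

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu}^t D) = 2 \hat{x}(\boldsymbol\mu)^\top (\Upsilon_{\boldsymbol\mu}^{t/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{t/2}) \frac{\partial \hat{x}(\boldsymbol\mu)}{\partial \mu_\kappa} + 2 \hat{x}(\boldsymbol\mu)^\top \Big( \Upsilon_{\boldsymbol\mu}^{t/2} K^\top D K \frac{\partial \Upsilon_{\boldsymbol\mu}^{t/2}}{\partial \mu_\kappa} \Big) \hat{x}(\boldsymbol\mu)$$
$$= t \, \hat{x}(\boldsymbol\mu)^\top \Big( \Upsilon_{\boldsymbol\mu}^{t/2} K^\top D K \Upsilon_{\boldsymbol\mu}^{t/2 - 1} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} \Big) \hat{x}(\boldsymbol\mu).$$

Utilizing

$$\hat{x}(\boldsymbol\mu)^\top \Upsilon_{\boldsymbol\mu}^{t/2} K^\top D K = \rho(M_{\boldsymbol\mu}^t D) \, \hat{x}(\boldsymbol\mu)^\top \Upsilon_{\boldsymbol\mu}^{-t/2}, \tag{51}$$

one obtains

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu}^t D) = t \, \rho(M_{\boldsymbol\mu}^t D) \, \hat{x}(\boldsymbol\mu)^\top \Upsilon_{\boldsymbol\mu}^{-t/2} \Upsilon_{\boldsymbol\mu}^{t/2 - 1} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} \hat{x}(\boldsymbol\mu)$$
$$= t \, \rho(M_{\boldsymbol\mu}^t D) \, \hat{x}(\boldsymbol\mu)^\top \Big( \Upsilon_{\boldsymbol\mu}^{-1} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa} \Big) \hat{x}(\boldsymbol\mu),$$

which is identical to (36) except for the presence of $t$. Since $\Upsilon_{\boldsymbol\mu}^{-1} \, \partial \Upsilon_{\boldsymbol\mu} / \partial \mu_\kappa$ is negative semi-definite for $\boldsymbol\mu \in (0, 1/2)^L$:

$$\frac{\partial}{\partial \mu_\kappa} \rho(M_{\boldsymbol\mu}^t D) \le 0. \tag{52}$$

The steps in the equality case are unchanged except for the scaling factor $t$, and the substitution of $t/2$ for $1/2$ in the powers of $\Upsilon_{\boldsymbol\mu}$. □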

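Corollary 3 (Global Mutation Rate Control). Let all the conditions be identical to those in Theorem 2, except that the modifier locus controls a single, global mutation rate $\gamma$, scaling all the $\mu_\xi = \gamma \beta_\xi$ parameters equally:

$$M_\gamma = \bigotimes_{\xi=1}^{L} \left[ (1 - \gamma \beta_\xi) I^{(\xi)} + \gamma \beta_\xi P^{(\xi)} \right]. \tag{53}$$

Then the spectral radius of $M_\gamma D$ is non-increasing in $\gamma$ for $\gamma \in (0, 1/2)$, and strictly decreasing if $D \ne c I$ for every $c > 0$.

Proof. This follows directly from the fact that $\frac{\partial}{\partial \mu_\xi} \rho(M_{\boldsymbol\mu} D) \le 0$ for any $\boldsymbol\mu \in (0, 1/2)^L$. If $D \ne c I$ for any $c > 0$, then for at least one $\xi$, $\frac{\partial}{\partial \mu_\xi} \rho(M_{\boldsymbol\mu} D) < 0$, hence $\frac{d}{d\gamma} \rho(M_\gamma D) < 0$. This can be shown explicitly.

Letting $\mu_\xi = \gamma \beta_\xi$, we wish to evaluate $\frac{d}{d\gamma} \rho(M_\gamma D)$. The derivation is identical to the steps in the proof of Theorem 2 except that $d/d\gamma$ replaces $\partial / \partial \mu_\kappa$, until step (34), which becomes:

$$\frac{d \Upsilon_\gamma}{d \gamma} = \frac{d}{d\gamma} \bigotimes_{\xi=1}^{L} \left[ (1 - \gamma \beta_\xi) I^{(\xi)} + \gamma \beta_\xi \Lambda^{(\xi)} \right]$$
$$= \sum_{\kappa=1}^{L} \beta_\kappa \Big( \bigotimes_{\xi=1}^{\kappa-1} \left[ (1 - \gamma \beta_\xi) I^{(\xi)} + \gamma \beta_\xi \Lambda^{(\xi)} \right] \Big) \otimes (\Lambda^{(\kappa)} - I^{(\kappa)}) \otimes \Big( \bigotimes_{\xi=\kappa+1}^{L} \left[ (1 - \gamma \beta_\xi) I^{(\xi)} + \gamma \beta_\xi \Lambda^{(\xi)} \right] \Big)$$
$$= \sum_{\kappa=1}^{L} \frac{\mu_\kappa}{\gamma} \frac{\partial \Upsilon_{\boldsymbol\mu}}{\partial \mu_\kappa}.$$

Applying this expression yields a positive weighted sum of partial derivatives

$$\frac{d}{d\gamma} \rho(M_\gamma D) = \sum_{\xi=1}^{L} \frac{\mu_\xi}{\gamma} \frac{\partial}{\partial \mu_\xi} \rho(M_{\boldsymbol\mu} D). \tag{54}$$

By (37), each of these partial derivative terms in (54) is non-positive, so $\frac{d}{d\gamma} \rho(M_\gamma D) < 0$ if at least one term is non-zero. To have all partial derivatives be 0 requires $D = c I$ for some $c > 0$, hence $\frac{d}{d\gamma} \rho(M_\gamma D) < 0$ if $D \ne c I$ for every $c > 0$. Note that if the modifier scales the mutation rates of only a subset of loci, then the sum in (54) is replaced with a sum over that subset of loci. Hence the magnitude of $\frac{d}{d\gamma} \rho(M_\gamma D)$ increases with the number of loci affected by the modifier, given fixed $\beta_\xi$ values. □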
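The weighted-sum relation (54), and the remark that the effect grows with the number of loci affected, can be illustrated by finite differences. The sketch below is a hedged numerical illustration: the matrices, the diagonal `d`, the base rate, and the step `h` are hypothetical, all $\beta_\xi$ are set to 1, and derivatives are approximated by central differences rather than computed from (36).

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def locus_matrix(mu, P):
    """Single-locus transmission matrix (1 - mu) I + mu P."""
    n = len(P)
    return [[(1.0 - mu) * (i == j) + mu * P[i][j] for j in range(n)]
            for i in range(n)]

def spectral_radius(A, iterations=5000):
    """Perron root of an entrywise positive matrix by power iteration."""
    n = len(A)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)
        v = [x / lam for x in w]
    return lam

# Hypothetical two-locus, two-allele model.
P1 = [[0.2, 0.6],
      [0.8, 0.4]]
P2 = [[0.0, 1.0],
      [1.0, 0.0]]
d = [1.0, 0.9, 0.8, 0.5]

def rho_at(mu1, mu2):
    """Spectral radius of M_mu D at rates (mu1, mu2)."""
    M = kron(locus_matrix(mu1, P1), locus_matrix(mu2, P2))
    MD = [[M[i][j] * d[j] for j in range(4)] for i in range(4)]
    return spectral_radius(MD)

base, h = 0.2, 1e-4

# Modifier scales both loci together (beta = 1 at each locus).
slope_both = (rho_at(base + h, base + h) - rho_at(base - h, base - h)) / (2 * h)
# Modifier scales only one locus; the other stays at the base rate.
slope_locus1 = (rho_at(base + h, base) - rho_at(base - h, base)) / (2 * h)
slope_locus2 = (rho_at(base, base + h) - rho_at(base, base - h)) / (2 * h)
```

With both loci affected, the slope should be the sum of the two single-locus slopes, and therefore larger in magnitude than either one alone.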

4.1 Neutral Surfaces of Mutation Rates 

Theorem 2 shows that if the marginal fitnesses at equilibrium do not depend on the allelic state of a particular locus (i.e. no instance of (49) occurs), then the mutation rate for that locus can be varied without changing the spectral radius of the stability matrix. This trivially defines a surface of points $\boldsymbol\eta \in (0, 1/2)^L$ on which $\rho(M_{\boldsymbol\eta} D)$ is invariant. Let us exclude this degenerate case for this section, and assume that the marginal fitnesses at equilibrium depend on every locus.

Under this assumption, $\rho(M_{\boldsymbol\mu} D)$ strictly decreases in each variable $\mu_\xi$. This raises the question of how $\rho(M_{\boldsymbol\mu} D)$ and $\rho(M_{\boldsymbol\eta} D)$ compare for two vectors $\boldsymbol\mu$ and $\boldsymbol\eta$ when $\mu_\kappa < \eta_\kappa$ for some $\kappa$, but $\mu_\xi > \eta_\xi$ for some other $\xi$, i.e. $\boldsymbol\mu$ and $\boldsymbol\eta$ are not ordered componentwise, and neither $\boldsymbol\mu \le \boldsymbol\eta$ nor $\boldsymbol\mu \ge \boldsymbol\eta$.

The Intermediate Value Theorem (Munkres, 1975, p. 154) tells us that there must be a set, $\mathcal{N}(\boldsymbol\mu) \subset \mathbb{R}^L$, surrounding $\boldsymbol\mu$, on which $\rho(M_{\boldsymbol\eta} D) = \rho(M_{\boldsymbol\mu} D)$ for all $\boldsymbol\eta \in \mathcal{N}(\boldsymbol\mu)$; this is because $\rho(M_{\boldsymbol\mu} D)$ is a continuous function from the matrix entries of $M_{\boldsymbol\mu} D$ to $\mathbb{R}$ (Horn and Johnson, 1985, pp. 539-540), and the entries of $M_{\boldsymbol\mu}$ are continuous functions of each $\mu_\xi$. The following properties will be shown for this set $\mathcal{N}(\boldsymbol\mu)$:

1. $\mathcal{N}(\boldsymbol\mu)$ passes through every orthant surrounding $\boldsymbol\mu$ except the strictly positive and strictly negative orthants;

2. $\mathcal{N}(\boldsymbol\mu)$ disconnects the mutation parameter space $(0, 1/2)^L$ into two connected parts; and

3. $\mathcal{N}(\boldsymbol\mu)$ is an $L - 1$ dimensional smooth manifold.

In a series of lemmas, the first two properties are established for arbitrary continuous, strictly decreasing functions, and the third is established for arbitrary differentiable functions with negative partial derivatives. These lemmas are then applied to the mutation rate model, in Theorem 4.

Lemma 3 (Orthants). Let $F: \mathbb{R}^L \to \mathbb{R}$ be continuous and strictly decreasing in each variable $m_i$ of $m \in \mathbb{R}^L$.

Let the orthants of $\mathbb{R}^L$ be represented as follows: $(I^+, I^=, I^-)$ represents a three-way partition of the indices $i = 1 \ldots L$. The orthant $Q(I^+, I^=, I^-) \subset \mathbb{R}^L$ is defined as:

$$Q(I^+, I^=, I^-) = \{ m \in \mathbb{R}^L : m_i > 0 \ \forall\, i \in I^+, \ m_i = 0 \ \forall\, i \in I^=, \ m_i < 0 \ \forall\, i \in I^- \}. \tag{55}$$

Then, for any non-empty choices of subsets $I^+$ and $I^-$, there is some $q \in Q(I^+, I^=, I^-)$ such that $F(m + q) = F(m)$.

Proof. Choose an arbitrary $q \in Q(I^+, I^=, I^-)$. If $F(m + q) = F(m)$, then one has found the sought-after $q$.

If $F(m + q) > F(m)$, then one can construct a $q'$ such that $F(m + q') < F(m) < F(m + q)$, and then find the desired value between $m + q$ and $m + q'$: increase all the negative elements of $q$ to 0 to define $q'$: $q'_i = 0$ for all $i \in I^-$, $q'_i = q_i$ for $i \in I^+, I^=$. Thus $q' \in Q(I^+, I^= \cup I^-, \emptyset)$. By the monotonicity of $F$ one knows $F(m + q') < F(m + q)$. And $q' \in Q(I^+, I^= \cup I^-, \emptyset)$ means $q' \ge 0$ componentwise with $q' \ne 0$, so $F(m + q') < F(m) < F(m + q)$.

Now consider the convex combination $q(\alpha) := (1 - \alpha) q + \alpha q'$. Since $F$ is a continuous function, with $F(m + q(0)) = F(m + q) > F(m) > F(m + q') = F(m + q(1))$, then by the Intermediate Value Theorem (Munkres, 1975, p. 154), there is some $\hat\alpha \in (0, 1)$ such that $F(m + q(\hat\alpha)) = F(m)$. We must verify that $q(\hat\alpha) \in Q(I^+, I^=, I^-)$ as required: for $i \in I^-$, $q(\hat\alpha)_i = (1 - \hat\alpha) q_i + \hat\alpha q'_i = (1 - \hat\alpha) q_i < 0$. So $q(\hat\alpha)$ is the desired point.

If $F(m + q) < F(m)$, the mirror argument applies, and we decrease the $q_i$, $i \in I^+$, to make a new point with $q'_i = 0$, giving $F(m + q') > F(m)$ where $q' \in Q(\emptyset, I^+ \cup I^=, I^-)$. Analogously, we know there is $\hat\alpha$ that yields $F(m + (1 - \hat\alpha) q + \hat\alpha q') = F(m)$. □
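The intermediate-value argument of Lemma 3 is constructive enough to sketch in code. The example below applies bisection to $F(\boldsymbol\mu) = \rho(M_{\boldsymbol\mu} D)$ in a hypothetical two-locus, two-allele model: the rate at locus 1 is raised by `delta`, and the rate at locus 2 is lowered until the spectral radius returns to its original value. The model, the bracket `[lo, mu2]`, and the function names are assumptions of this illustration; bisection stands in for the existence argument and is not the paper's method.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def locus_matrix(mu, P):
    """Single-locus transmission matrix (1 - mu) I + mu P."""
    n = len(P)
    return [[(1.0 - mu) * (i == j) + mu * P[i][j] for j in range(n)]
            for i in range(n)]

def spectral_radius(A, iterations=1000):
    """Perron root of an entrywise positive matrix by power iteration."""
    n = len(A)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)
        v = [x / lam for x in w]
    return lam

# Hypothetical model: the same two-allele mutation distribution at both loci.
P = [[0.0, 1.0],
     [1.0, 0.0]]
d = [1.0, 0.9, 0.8, 0.72]   # hypothetical positive diagonal of D

def rho_at(mu1, mu2):
    """Spectral radius of M_mu D at rates (mu1, mu2)."""
    M = kron(locus_matrix(mu1, P), locus_matrix(mu2, P))
    MD = [[M[i][j] * d[j] for j in range(4)] for i in range(4)]
    return spectral_radius(MD)

def compensating_mu2(mu1, mu2, delta, lo=0.01, steps=40):
    """Find eta2 < mu2 with rho(mu1 + delta, eta2) = rho(mu1, mu2) by bisection."""
    target = rho_at(mu1, mu2)
    hi = mu2
    if not (rho_at(mu1 + delta, lo) > target > rho_at(mu1 + delta, hi)):
        raise ValueError("no compensating rate inside the bracket")
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if rho_at(mu1 + delta, mid) > target:
            lo = mid    # spectral radius still above target: raise the rate
        else:
            hi = mid    # at or below target: lower the rate
    return 0.5 * (lo + hi)

eta2 = compensating_mu2(0.2, 0.2, 0.01)
residual = abs(rho_at(0.21, eta2) - rho_at(0.2, 0.2))
```

The point `(0.21, eta2)` lies in the mixed orthant around `(0.2, 0.2)`, one rate higher and one lower, and is a numerical example of a point on the neutral set $\mathcal{N}(\boldsymbol\mu)$.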

Lemma 4 (Connected Regions). Let $F: (0, c)^L \to \mathbb{R}$ be continuous and strictly decreasing in each variable $m_i$ of $m \in (0, c)^L \subset \mathbb{R}^L$, where $c > 0$ is a constant. Then the set $\mathcal{N}(m) = \{ m' : F(m') = F(m) \}$ disconnects $(0, c)^L$ into two connected sets.

Proof. To show that the set $\mathcal{N}(m)$ disconnects $(0, c)^L$, it is sufficient to find two points in $(0, c)^L$ such that every continuous curve between them intersects $\mathcal{N}(m)$. For the given $m \in (0, c)^L$, let $q_i := \min(m_i, c - m_i)/2 > 0$. The two requisite points will be $m - q$ and $m + q$. Clearly $0 < m_i - q_i < m_i < m_i + q_i < c$, so $m - q, m + q \in (0, c)^L$. Since $F$ decreases in each variable, then $F(m - q) > F(m) > F(m + q)$. Define the continuous curve $C: [0, 1] \to (0, c)^L$ to have $C(0) = m - q$, and $C(1) = m + q$. Since $F$ is continuous and $C$ is continuous, then $F \circ C: [0, 1] \to \mathbb{R}$ is continuous. Since

$$F(C(0)) = F(m - q) > F(m) > F(C(1)) = F(m + q),$$

by the Intermediate Value Theorem there must be some $\alpha \in [0, 1]$ such that $F(C(\alpha)) = F(m)$, which means $C(\alpha) \in \mathcal{N}(m)$. Thus every continuous curve between $m - q$ and $m + q$ intersects $\mathcal{N}(m)$. Therefore $\mathcal{N}(m)$ disconnects $(0, c)^L$.

To show that $(0, c)^L - \mathcal{N}(m)$ consists of two connected sets, it is sufficient to show that between any pair of points in the same set, there is a continuous curve $C \subset (0, c)^L$ that does not intersect $\mathcal{N}(m)$. The two sets are

$$S^-(m) := \{ p \in (0, c)^L : F(p) < F(m) \}$$

and

$$S^+(m) := \{ p \in (0, c)^L : F(p) > F(m) \}.$$

Clearly $S^-(m)$, $\mathcal{N}(m)$, and $S^+(m)$ are disjoint, and $S^-(m) \cup \mathcal{N}(m) \cup S^+(m) = (0, c)^L$. Now we shall see that $S^-(m)$ is connected, and $S^+(m)$ is connected.

For any two distinct points $p, p' \in S^-(m)$, construct a new point, $p''$, that combines the maxima from the two points: $p''_i = \max(p_i, p'_i)$. Let one curve be the line from $p$ to $p''$, $\{ (1 - \alpha) p + \alpha p'' : \alpha \in [0, 1] \}$, and the other curve be the line from $p'$ to $p''$, $\{ (1 - \alpha) p' + \alpha p'' : \alpha \in [0, 1] \}$. Both curves are clearly within $(0, c)^L$. The union of the two curves forms a continuous curve between $p$ and $p'$.

Now it must be verified that the curves do not intersect $\mathcal{N}(m)$. For the line between $p$ and $p''$: $p''_i \ge p_i$ for each $i$, so $(1 - \alpha) p_i + \alpha p''_i \ge p_i$ for all $\alpha \in [0, 1]$. Therefore $F((1 - \alpha) p + \alpha p'') \le F(p) < F(m)$ for all $\alpha \in [0, 1]$. So all the points on the line remain within $S^-(m)$. The same applies to the other line between $p'$ and $p''$. Thus there is a continuous curve connecting $p$ and $p' \in S^-(m)$ that does not intersect $\mathcal{N}(m)$. Therefore $S^-(m)$ is connected. The mirror argument applies to $S^+(m)$. □

Lemma 5 (Smooth Manifold). Let F : (0,c)^ — !■ R fee a smooth map, and 
strictly decreasing in each variable rui o/ m £ (0,c)^ C M}" , where c > is a 
constant, and L > 2. Further, let dF{m) / dnii < for each i. Then the set 
A/'(m) = {m': i^(m') — F{ui)} is a smooth, L — 1 dimensional submanifold of 
(0,c)^ 

Proof. Th e proof is immediate usin g a general form of the Implicit Function 
Theore m dSinger and Thorpel 19671 p . 135)), referred to as the Preimage The- 
orem in IGuillemin and Pollack! ( 1974L p. 21), or the Regular Value Theorem 
(|HirschL fl976l Theorem 3.3 p. 22). which I restate: 



Theorem 3 (Implicit Function). (Singer and Thorpe, 1967, p. 135). Let $X$ and $Y$ be smooth manifolds, with $\dim X \ge \dim Y$. Let $\psi : X \to Y$ be a smooth map. Let $y_0 \in \psi(X)$ and let

$X_0 = \psi^{-1}(y_0)$.

Assume that for each $x \in X_0$, $d\psi(x) : T(X,x) \to T(Y,\psi(x))$ is surjective, i.e. the $\dim X \times \dim Y$ matrix

$\left[ (\partial/\partial x_j)(y_i \circ \psi)|_x \right]$

is full rank, $\dim Y$. Then $X_0$ has a manifold structure, whose underlying topology is the relative topology of $X_0$ in $X$, and in which the inclusion map $X_0 \to X$ is smooth. Furthermore, $\dim X_0 = \dim X - \dim Y$.



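Here, let $X = (0,c)^L$, $Y = \mathbb{R}$, $\psi = F$, $y_0 = F(\mathbf{m})$, $X_0 = \mathcal{N}(\mathbf{m})$, $\dim X = L$, $\dim Y = 1$, and

$d\psi(x) = \left[ (\partial/\partial x_j)(y_i \circ \psi)|_x \right] = \left[ \partial F(\mathbf{m})/\partial m_i \right]$.

Here, $d\psi(x)$ is surjective if $\partial F(\mathbf{m})/\partial m_i \ne 0$ for at least one $i$. In fact, by hypothesis $\partial F(\mathbf{m})/\partial m_i < 0$ for every $\mathbf{m}$ and $i$ (so every value $F(\mathbf{m})$ is a regular value, making $F$ a submersion). Thus, $\mathcal{N}(\mathbf{m})$ is a smooth submanifold of $(0,c)^L$ with $\dim \mathcal{N}(\mathbf{m}) = \dim (0,c)^L - \dim \mathbb{R} = L - 1$. □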

These lemmas are now applied to the modifier model: 

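Theorem 4 (Manifold of Neutral Mutation Rates). Assume the conditions of Theorem 2. For any given mutation rate vector $\boldsymbol{\mu} \in (0,1/2)^L$, the set of mutation rate vectors that produce the same spectral radius as $\boldsymbol{\mu}$, $\mathcal{N}(\boldsymbol{\mu}) = \{\boldsymbol{\eta} \in (0,1/2)^L : \rho(\mathbf{M}_{\eta}\mathbf{D}) = \rho(\mathbf{M}_{\mu}\mathbf{D})\}$, has the following properties:

1. There is some $\boldsymbol{\eta} \in \mathcal{N}(\boldsymbol{\mu})$ in every orthant around $\boldsymbol{\mu}$ except the orthants $\boldsymbol{\eta} \le \boldsymbol{\mu}$ and $\boldsymbol{\eta} \ge \boldsymbol{\mu}$;

2. $\mathcal{N}(\boldsymbol{\mu})$ disconnects the mutation parameter space $(0,1/2)^L$ into two connected parts, $S^-(\boldsymbol{\mu})$ and $S^+(\boldsymbol{\mu})$, such that $\rho(\mathbf{M}_{\eta}\mathbf{D}) < \rho(\mathbf{M}_{\mu}\mathbf{D})$ for all $\boldsymbol{\eta} \in S^-(\boldsymbol{\mu})$, and $\rho(\mathbf{M}_{\eta}\mathbf{D}) > \rho(\mathbf{M}_{\mu}\mathbf{D})$ for all $\boldsymbol{\eta} \in S^+(\boldsymbol{\mu})$.

3. $\mathcal{N}(\boldsymbol{\mu})$ is a smooth manifold of dimension $L - 1$, which is a subset of an affine algebraic variety.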

Proof. Let $F(\boldsymbol{\mu}) := \rho(\mathbf{M}_{\mu}\mathbf{D})$. From Theorem 2, $\rho(\mathbf{M}_{\mu}\mathbf{D})$ is continuous and strictly decreasing in each mutation rate $\mu_\kappa$. This satisfies the conditions of Lemma 3 and establishes 1. Further, with $c = 1/2$ the conditions of Lemma 4 are satisfied and 2 is established. As Lemma 5 requires, $(0,1/2)^L$ and $\mathbb{R}$ are smooth manifolds, and $\rho(\mathbf{M}_{\mu}\mathbf{D})$ is a smooth map with respect to $\boldsymbol{\mu}$ when $\mathbf{M}_{\mu}\mathbf{D}$ is irreducible as it is in Theorem 2 (since for simple eigenvalues, all orders of partial derivatives with respect to the matrix entries exist (Deutsch and Neumann, 1984, p. 2)). The last requirement of Lemma 5 is met since Theorem 3 shows $\partial\rho(\mathbf{M}_{\mu}\mathbf{D})/\partial\mu_\kappa < 0$ for each $\kappa = 1$ to $L$, therefore 3 is established.

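It can be seen that $\mathcal{N}(\boldsymbol{\mu})$ is a subset of an affine algebraic variety, because $\mathcal{N}(\boldsymbol{\mu}) \subset (0,1/2)^L \cap V$, where $V$ is the affine variety

$V = \left\{ \boldsymbol{\eta} : \det\left[ \bigotimes_{\xi=1}^{L} \left[ (1-\eta_\xi)\mathbf{I}^{(\xi)} + \eta_\xi\mathbf{P}^{(\xi)} \right] \mathbf{D} - \rho(\mathbf{M}_{\mu}\mathbf{D})\,\mathbf{I} \right] = 0 \right\}$. □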
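Theorem 4 can be illustrated numerically. The sketch below makes several assumptions that are not taken from the paper: two diallelic loci, a symmetric swap matrix `P` at each locus, and arbitrary illustrative fitness values `w` on the diagonal of $\mathbf{D}$ (not normalized to an equilibrium). It computes $\rho(\mathbf{M}_{\eta}\mathbf{D})$ by power iteration, then uses bisection to find the decrease at locus 2 that exactly compensates an increase at locus 1, i.e. a point on the neutral set in a mixed orthant.

```python
def kron(A, B):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def mutation_matrix(rates, P):
    """M = Kronecker product over loci of (1 - mu) I + mu P."""
    n = len(P)
    M = [[1.0]]
    for mu in rates:
        Mx = [[(1.0 - mu) * (i == j) + mu * P[i][j] for j in range(n)]
              for i in range(n)]
        M = kron(M, Mx)
    return M


def rho(rates, P, w, iters=500):
    """Spectral radius of M D, D = diag(w), by power iteration."""
    M = mutation_matrix(rates, P)
    n = len(w)
    x = [1.0 / n] * n
    r = 0.0
    for _ in range(iters):
        y = [sum(M[i][j] * w[j] * x[j] for j in range(n)) for i in range(n)]
        r = sum(y)              # x sums to 1, so sum(y) converges to rho
        x = [v / r for v in y]
    return r


def compensating_rate(mu, eta1, P, w):
    """Bisect for eta2 in (0, mu[1]) with rho([eta1, eta2]) = rho(mu)."""
    target = rho(mu, P, w)
    lo, hi = 0.0, mu[1]
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if rho([eta1, mid], P, w) > target:
            lo = mid            # rho still too large: raise the rate at locus 2
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative values (assumptions, not data from the paper).
P = [[0.0, 1.0], [1.0, 0.0]]    # symmetric, hence reversible
w = [1.0, 0.6, 0.5, 0.2]        # fitnesses of haplotypes 00, 01, 10, 11
mu = [0.05, 0.05]
eta1 = 0.08                     # increased rate at locus 1

rho_mu = rho(mu, P, w)
eta2 = compensating_rate(mu, eta1, P, w)
print(eta2, rho([eta1, eta2], P, w) - rho_mu)
```

In this toy setting, any rate at locus 2 below the returned `eta2` puts the pair `[eta1, rate]` on the side of the neutral set where the spectral radius exceeds `rho_mu`.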



4.2 Main Results 

Theorem 2, Corollary 2, and Theorem 3 may now be applied to the dynamics of the modifier gene model:

Theorem 5 (Multivariate Reduction Principle for Symmetrizable Mutation Rates at Multiple Loci). Consider a genetic system in which a modifier locus controls the mutation rates of a group of loci under viability selection. Mutations occur independently among the loci under selection. In a population near equilibrium under a stable mutation-selection balance, fixed at the modifier locus, let a new allele of the modifier locus be introduced. The new modifier allele can change the mutation rate parameter separately for each locus, and each parameter scales equally the probability of mutations at that locus.
Under the following constraints:

1. mutation rates at each locus range between 0 and 1/2,

2. no recombination or other transformation process acts on the genes, 

3. the mutation matrix for each locus is irreducible, and also irreducible when 
restricted to nonlethal alleles, 

4. the mutation matrix for each locus is the transition matrix for some reversible Markov chain,

then the new modifier allele will increase (decrease) in frequency at a geometric rate if, among the loci that affect the marginal fitnesses of the haplotypes present in the population:

1. it reduces (increases) the mutation rate at any locus, and does not increase 
(decrease) the mutation rates at any locus; 

2. it increases the mutation rates for at least one locus, and decreases the mutation rates for at least one locus, and falls below (above) the neutral manifold of mutation rates that includes the mutation rates at the equilibrium. Should the mutation rates produced by the new modifier allele fall on this neutral manifold, then it will not change frequency at a geometric rate.

Moreover, the further that the new set of mutation rates is from the neutral manifold, the stronger is the eventual induced selection for (against) the new modifier allele, up to a maximum fitness of $\max_i w_i/\bar{w}$ for a modifier allele that eliminates all mutation.

These results hold, in the case of multicellular organisms, for arbitrary num- 
bers of cell divisions between gamete generations. The strength of selection on 
the modifier locus scales in proportion to the number of cell divisions in the 
germline, and increases with the number of loci controlled by the modifier. 

Proof. In the single cell-division model, the population begins at equilibrium fixed on modifier allele $b$ which yields mutation rate vector $\boldsymbol{\mu}$, and (5) becomes:

$\mathbf{z}_b = \mathbf{M}_{\mu}\mathbf{D}\,\mathbf{z}_b$,

Therefore, in dSI]), $\mathbf{v}(\boldsymbol{\mu}) = \mathbf{z}_b$, and $\rho(\mathbf{M}_{\mu}\mathbf{D}) = 1$.* Let $\boldsymbol{\eta}$ be the vector of mutation rates produced by the new modifier allele $a$. If $\boldsymbol{\eta} \le \boldsymbol{\mu}$ and $\eta_\kappa < \mu_\kappa$ for some locus $\kappa$ for which the equilibrium marginal fitnesses depend on the alleles at locus $\kappa$, then by Theorem 2, $\rho(\mathbf{M}_{\eta}\mathbf{D}) > \rho(\mathbf{M}_{\mu}\mathbf{D}) = 1$, so new modifier allele $a$ increases at a geometric rate. The mirror argument applies when $\boldsymbol{\eta} \ge \boldsymbol{\mu}$ and $\eta_\kappa > \mu_\kappa$ for some locus $\kappa$ for which the equilibrium marginal fitnesses depend on the alleles at locus $\kappa$, in which case the new modifier allele will decrease at a geometric rate.

In the case where $\eta_\kappa > \mu_\kappa$ and $\eta_j < \mu_j$ for some $\kappa \ne j$, Theorem 4 establishes that there is a smooth $L - 1$ dimensional surface $\mathcal{N}(\boldsymbol{\mu})$ that dissects this orthant surrounding $\boldsymbol{\mu}$ into a set below $\mathcal{N}(\boldsymbol{\mu})$ in which $\rho(\mathbf{M}_{\eta}\mathbf{D}) > 1$, and a set above in which $\rho(\mathbf{M}_{\eta}\mathbf{D}) < 1$, the new modifier allele increasing in frequency in the former case, and decreasing in the latter case.

If $\boldsymbol{\eta} \in \mathcal{N}(\boldsymbol{\mu})$, then by definition $\rho(\mathbf{M}_{\eta}\mathbf{D}) = 1$, so the new allele will not change frequency at a geometric rate.

By 'further from' the neutral manifold, I mean a partial ordering of mutation rate vectors in which $\boldsymbol{\mu}_1 \prec \boldsymbol{\mu}_2$ if $\boldsymbol{\mu}_1 \le \boldsymbol{\mu}_2$ (not equal for at least one locus that the equilibrium fitnesses $w_i$ depend on). Since the derivative of $\rho(\mathbf{M}_{\eta}\mathbf{D})$ is negative with respect to each variable $\eta_\xi$ when locus $\xi$ affects $w_i$, if $\boldsymbol{\mu}_1 \prec \boldsymbol{\mu}_2 \prec \boldsymbol{\mu}_3$, then $\rho(\mathbf{M}_{\mu_1}\mathbf{D}) > \rho(\mathbf{M}_{\mu_2}\mathbf{D}) > \rho(\mathbf{M}_{\mu_3}\mathbf{D})$. For a modifier allele that eliminates mutation, $\boldsymbol{\mu} = \mathbf{0}$, so $\rho(\mathbf{M}_{0}\mathbf{D}) = \rho(\mathbf{D}) = 1 + V = \max_i w_i/\bar{w}$.

In the multiple cell-division model, the initial equilibrium satisfies:

$\mathbf{z}_b = \mathbf{M}_{\mu}^{t}\mathbf{D}\,\mathbf{z}_b$,

so $\rho(\mathbf{M}_{\mu}^{t}\mathbf{D}) = 1$. From Corollary 2 we see that letting $t > 1$ does not alter the inequalities on the spectral radius, so the same conclusions apply for all $t$. □

*The use of this equilibrium relation, without having to explicitly solve for the equilibrium, was first introduced into modifier gene theory by Teague (1977, p. 89).

5 Discussion 

The motivation for the paper was to extend the general theory of modifier genes 
beyond single event models and the constraint of linear variation. Here, multiple 
independent mutations among multiple loci are modeled, with a modifier gene 
that has arbitrary control of the mutation rates at each locus. Under this 
multivariate, multiplicative form of variation, the reduction principle is again 
found to hold. In particular: 

1. The result applies for arbitrary selection coefficients on the diploid genotypes (with some technical constraints on the global pattern of any lethal genotypes), arbitrary mutation rates and mutation distributions as long as they are symmetrizable, arbitrary numbers of (tightly linked) loci and alleles, arbitrary control over each single-locus mutation rate, and any number of cell divisions in the germline.

2. Changes in the mutation rate at a locus will be neutral if the alleles at 
that locus do not make any difference in the marginal fitnesses of the 
haplotypes under selection. 

3. There is a surface of mutation rates that a new modifier allele can produce that leaves it neutral, i.e. it will not change frequency at a geometric rate when introduced into the population.

4. Mutation rates that fall below this surface will cause the new modifier 
allele to increase when rare, and rates above this surface will cause it to 
go extinct. The surface is such that the modifier allele can increase the 
mutation rate at some loci and decrease at others — for any arbitrary 
choice of loci that affect the marginal fitnesses at equilibrium — and there 
will always be some values for the magnitude of these changes that fall 
below the neutral surface of mutation rates, and other values that fall 
above. 

5. The strength of selection on a new modifier allele increases with the distance of its rates from the neutral surface of mutation rates, which increases with each locus affected, and it increases with the number of cell divisions in the germline.

Two properties of modifier polymorphisms are also shown:

1. The "viability analogous, Hardy-Weinberg" modifier polymorphisms that emerge in single-event models cannot exist under multivariate, multiplicative variation in transmission due to the loss of convexity of the space of transmission values.

2. When the modifier locus is polymorphic, the only values that matter to the 
change in frequencies of the loci under selection are the mean transmission 
probabilities for those loci; the frequencies and associations of the modifier 
alleles are otherwise irrelevant. 

5.1 Q & A 

Since the implications of the main results may not be immediately apparent, 
an attempt to elucidate them is provided through the following 'Question and 
Answer' format. 

Q.1. What new phenomena are found in these results?

A. While the general result that mutation rates evolve to decrease is not 
novel, several phenomena are: 

1. Increases in mutation rates may evolve if they are compensated for by decreases at other loci (see Section 5.3 below).

2. The strength of selection for (against) a new modifier grows with the number of loci whose mutation rates it decreases (increases) (see Corollary 3).

3. Mutation rates of loci that do not affect the marginal fitnesses at equilibrium may be changed 'with impunity' by the new modifier allele, including when they are changed as a side effect of changes in the mutation rates at other loci. This implies, other things being equal, that if there is local tuning of mutation rates, then neutral loci should have greater mutation rates than loci held in mutation-selection balance (see Section 5.3.1).

4. The reduction principle applies when there are multiple cell divisions in the lineage from zygote to gamete, and the strength of selection on the modifier locus scales in proportion to the number of cell divisions from zygote to gamete (Corollary 2).

Q.2. Why is this called 'evolutionary reduction' if it is possible for some muta- 
tion rates to evolve an increase? 

A. It is a 'reduction result' because mutation rates must be below the neutral manifold in order for the new modifier allele to invade, making the neutral manifold like a wall (see Section 5.3.2). Also, the further below this wall that the mutation rates are, the stronger the induced selection for the modifier allele carrying them.

Q.3. Could there be some sort of complex epistatic multi-locus selection regime that would allow mutation rates to get around this wall?

A. No, the neutral manifold emerges for all possible selection regimes, 
the only 'holes' in the wall being the mutation rates of loci that do not 
affect the marginal fitnesses at equilibrium, which are free to evolve in any 
direction. 

Q.4. Doesn't the reduction result depend on the assumption that most muta- 
tions are deleterious? Why isn't this assumption stated anywhere in the 
model? 

A. The reduction result does not depend on any assumption that most mutations are deleterious, and that is why it takes some mathematical machinery to show it. What it does depend on is a net flux of mutations at equilibrium from more fit to less fit haplotypes, which is a necessary and emergent property of mutation-selection balances. By net flux I mean an absence of the 'detailed balance' condition that characterizes the stationary state of reversible Markov chains (20), in which the fraction of the population mutating from type $j$ to $k$ equals the fraction mutating from $k$ to $j$. At a mutation-selection balance, a net flux is necessary to keep the haplotypes with above average fitness from continuing to grow in frequency, and to keep the haplotypes with below average fitness from continuing to decline in frequency. This outcome will occur regardless of how the distribution of fitness effects (DFE) (Eyre-Walker and Keightley, 2007) is set for each diploid genotype by nature.

By altering the flux, the new modifier allele unbalances the mutation-selection balance within the subpopulation that contains it. It is not immediately obvious why a reduction in the flux equally across all haplotypes (linear variation) would create a subpopulation with increased mean fitness, because the flow is reduced in both directions: from less fit to more fit, and from more fit to less fit. But the net effect is always to increase the subpopulation's mean fitness, as shown by Theorem 5.2 of Karlin (1982). Here, fluxes are scaled equally between all single locus alleles, multiplied across loci, and Theorem 2 gives the multivariate reduction result.

Q.5. In nature, are not the rates of mutations that affect the phenotype so low 
that multiple mutations in a gamete are very rare? — in which case, don't 
the results here reduce to the classical results for single events? 

A. No, for several reasons: 

1. Phenomena 1 and 3 in Q.1 above are novel to the multivariate control of mutation rates, and are not eliminated in the limit of small mutation rates. In this limit, when multiple mutations are rare enough to be ignored,

$\mathbf{M}_{\mu} = \bigotimes_{\xi=1}^{L} \left[ (1-\mu_\xi)\mathbf{I}^{(\xi)} + \mu_\xi\mathbf{P}^{(\xi)} \right]$ (56)

Ignoring the $O(\mu^2)$ terms, the space of variation $\mathcal{M} = \{\mathbf{M}_{\mu}\}$ becomes convex, and the "viability analogous, Hardy-Weinberg" equilibria become feasible. A modifier allele that scales all $\mu_\xi$ equally will produce linear variation, as is covered by earlier treatments (Altenberg, 1984; Altenberg and Feldman, 1987; Altenberg, 2009). But modifiers that change the ratios between the $\mu_\xi$ do not produce linear variation, and are subject to phenomena 1 and 3 in Q.1 above.

2. Mutation rates observed in organisms are actually not small enough to ignore multiple mutations. For example, Roach et al. (2010) estimate that humans have some 70 new nucleotide mutations per diploid genome per generation. On a per-cell division basis, this puts the human germline mutation rate lower than that recorded for any other species (Lynch, 2010). For the fraction of these mutations that have phenotypic effect, Eyre-Walker and Keightley (2007) summarize several studies that estimate the proportion of the genome subject to natural selection at around 5% in mammals.

Letting $\lambda$ be the number of non-neutral mutations per haplotype per generation, this yields an estimate of $\lambda = 0.05 \times 70/2 = 1.75$. Lynch (2010) gives a concordant estimate of 0.9 to 4.5 deleterious mutations per diploid genome per generation, or $0.45 \le \lambda \le 2.25$ per haplotype. With this magnitude for the expected number of mutations, a modifier allele that changes the global mutation rate will not be producing linear or even convex variation. When $\lambda = \mu L \ll L$, the multiplicative model is approximated by a Poisson process, where the probability of parent $j$ producing gamete $i$ with $\nu$ mutations is $\lambda^{\nu} e^{-\lambda}/\nu!$. The ratio between gametes with multiple mutations and gametes with single mutations is:

$\frac{\Pr[\ge 2]}{\Pr[1]} = \frac{1 - e^{-\lambda} - \lambda e^{-\lambda}}{\lambda e^{-\lambda}} = \frac{1}{\lambda}\left(e^{\lambda} - 1\right) - 1 = \sum_{\nu=0}^{\infty} \frac{\lambda^{\nu+1}}{(\nu+2)!}$.

At the small mutation rate limit,

$\lim_{\lambda \to 0} \frac{\Pr[\ge 2]}{\Pr[1]} = 0$,

but for $\lambda = 1.75$,

$\frac{\Pr[\ge 2]}{\Pr[1]} = \frac{1}{1.75}\left(e^{1.75} - 1\right) - 1 = 1.72 \gg 0$.

Here there are about twice as many multiple mutations as single mutations. The range $0.45 \le \lambda \le 2.25$ gives $0.26 \le \Pr[\ge 2]/\Pr[1] \le 2.8$. So multiple mutations cannot be ignored in (55). Other eukaryotic species whose mutation rates have been measured give rates of deleterious mutations of $\lambda > 0.5$ (Kondrashov and Kondrashov, 2010, p. 1171). Therefore, when the expected number of mutations is on the order of 1, the multiplicative model is not approximated by the classical model of linear variation and requires an approach such as taken here.

3. Lastly, a full picture of the evolution of mutation rates must take into account not only the 'wild type' mutation rates, which are the endpoint of the evolutionary process, but also the full range of mutation rates that organisms are capable of generating, because they are the values that test the evolutionary stability of the wild-type values. In humans, somatic cells exhibit mutation rates that are one to two orders of magnitude greater than germline cells (Lynch, 2010). This shows that human cells are capable of producing many mutations per generation, and makes necessary a treatment that can handle multiple mutations in order to analyze the evolutionary stability of low mutation rates.
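The Poisson ratios quoted in part 2 of this answer can be reproduced in a few lines. This is a minimal sketch; the function name is hypothetical, and the only inputs are the values of the expected number of non-neutral mutations per haplotype given in the text (0.45, 1.75, 2.25).

```python
import math


def multiple_to_single_ratio(lam):
    """Pr[>= 2 mutations] / Pr[exactly 1 mutation] for a Poisson(lam) count."""
    p0 = math.exp(-lam)          # no mutations
    p1 = lam * math.exp(-lam)    # exactly one mutation
    return (1.0 - p0 - p1) / p1


for lam in (0.45, 1.75, 2.25):
    closed_form = (math.exp(lam) - 1.0) / lam - 1.0
    print(lam, round(multiple_to_single_ratio(lam), 2), round(closed_form, 2))
```

Both expressions agree to rounding error, and the ratio shrinks toward zero as `lam` does, which is the small mutation rate limit stated above.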

Q.6. This result holds for populations fixed at the modifier locus. What can 
we expect if the initial population is polymorphic for the modifier? 

A. Here the analytical techniques break down, because there is no clear 
relationship between the variation in transmission produced by a new modifier allele and the mean transmission probabilities (2) in the population.
Based on the ubiquity of the reduction result, one can conjecture that a 
form of reduction result will hold, but its exact form requires analysis that 
can handle more general forms of variation in transmission. 

Q.7. Once a new modifier allele successfully invades, what happens then? 

A. The results here are for local perturbations of the equilibrium population, and so do not reveal what happens once a modifier allele invades. As the new modifier allele increases in frequency, it obviously changes the mutation rates experienced by the loci under selection. If these changes are small enough, then $T(i \leftarrow j, k)$ in (2) will change only slightly, and by the 'theory of small parameters' (Karlin and McGregor, 1972a), the haplotype frequencies of loci under selection will converge to another stable equilibrium near the starting stable equilibrium.

For modifiers with larger effects, however, the original equilibrium can 
potentially become unstable or even disappear. Homotopy continuation 
methods may be of use in elucidating the possibilities here. 

Whatever the population re-equilibrates to after invasion of the modifier allele, it is again subject to invasion by additional modifier alleles that reduce the mutation rates below the current neutral manifold of mutation rates.

Q.8. What guess can be made as to how the inclusion of recombination would 
change the results? 

A. The inclusion of recombination makes the model into an example of a 'mixed process', which is where departures from the reduction result have been found. Holsinger and Feldman (1983b) find that maximal mutation rates evolve in a model of pure selfing and overdominance (which is a mixed process) because selfing drives down the frequency of the fittest genotype, the heterozygote, which high mutation helps to restore. So, could mutation restore the frequencies of high fitness genotypes that recombination drives down? Such a situation is difficult to imagine, because the genotypes that recombination would drive down are overdominant coadapted gene complexes, and it seems unlikely that mutation would help to boost the frequency of such complexes.

As to recombination between the modifier gene and the loci under selection, it in essence dilutes the subpopulation by mixing in some of the equilibrium population. So recombination would be expected to moderate the force of selection induced on a new modifier allele, but not to change its direction.

Q.9. What do these results have to say about populations not at equilibrium? 

A. Almost, but not quite, nothing. Populations that are far from equilibrium (due to small populations, populations under varying selection, populations in transient phases of evolution, and populations evolving with novel genotypes) have modifier gene dynamics that are fundamentally different from the equilibrium populations considered here. However, at some point where the population becomes 'close enough' to equilibrium, the near-equilibrium dynamics will again take hold. This appears to be seen, for example, by Giraud et al. (2001) in enteric bacterial populations, which are far from equilibrium when first colonizing a new host, and evolve higher mutation rates, but after some period of time evolve reduced mutation rates. So at some point with large enough population sizes, slow enough variation in selection, damped out transients, or rare enough novel genotypes, the results for near-equilibrium models should come to dominate the dynamics.

Next, details of additional aspects of the results will be discussed: The 
constraint that mutation be symmetrizable, the multivariate reduction principle, 
the strength of selection on the modifier locus, and models that depart from the 
reduction principle. 

5.2 The Symmetrizable Mutation Constraint 

The constraint that the mutation matrices be symmetrizable is necessary to use the Rayleigh-Ritz variational characterization for the spectral radius. It causes all the eigenvalues of $\mathbf{M}_{\mu}$ and $\mathbf{M}_{\mu}\mathbf{D}$ to be real. Since symmetrizable $\mathbf{M}_{\mu}$ is the transition matrix for a reversible Markov chain, its Perron vector produces 'detailed balance' (20). The mathematical tractability of reversible Markov chains has led to their widespread use in phylogenetic inference models, regardless of whether empirical mutation rates actually are symmetrizable (Rodriguez et al., 1990; Yang, 1995; Jayaswal et al., 2005; Squartini and Arndt, 2008).



There is no reason to believe, however, that symmetrizability is fundamental to the reduction result. It is not needed in the general reduction result for linear variation (Altenberg, 1984; Altenberg and Feldman, 1987; Altenberg, 2009). In the absence of symmetrizability, the non-Perron eigenvalues may be complex. Complex eigenvalues correspond to circulating non-zero net flows between states. But as discussed in Q.4, net flows from fitter-than-average haplotypes to less-fit-than-average haplotypes are already a part of any mutation-selection balance.

I conjecture that the symmetrizability constraint can be removed, and the 
multivariate reduction result will still pertain. 

5.3 The Multivariate Reduction Principle 

Details of the multivariate reduction principle are now discussed. 

5.3.1 Negative Correlations between Selection and Mutation Rates 

A new feature of this model is that it analyzes modifier loci that individually tune the mutation rates of different loci. When the marginal fitnesses at equilibrium do not depend on a particular locus, the modifier locus can 'detect' this, even in the midst of large complex fitness interactions among the other loci, by being able to change the mutation rates at this locus with no effect on the modifier allele's survival.

This means if genetic variation exists for local mutation rates, these rates will evolve differently depending on whether the locus is neutral or not. Empirical studies find substantial variation in mutation rates between sites within a genome (Baer et al., 2007; King and Kashi, 2007; Fox et al., 2008). An implication of Theorem 2 is that these differences may be the result of different histories of selection among loci. In particular, neutral loci do not have the reduction force operating on variation for their individual mutation rates, so they may evolve higher mutation rates.

If there were any mechanism that decreased mutations at one location at the expense of increasing it at another location, then neutral loci could become a 'dumping ground' for such negative pleiotropic relations, and would enable their partner loci under selection to evolve lower mutation rates. A potential example of such pleiotropic interactions is documented by Hoede et al., who find that single-stranded DNA secondary structure reduces mutation rates in E. coli, and that such structures are found in excess within heavily transcribed sections of DNA. If two sequences A and B were competing to form secondary structure with a third site C, modifier dynamics would favor the evolution of secondary structure to protect the sequence A or B incurring the greatest genetic load.

The possibility that there is a systemic negative correlation between under- 
lying mutation rates and selection intensity presents a confounding possibility 
for models of base substitution in phylogenetic models. 



5.3.2 The Neutral Manifold of Mutation Rates 



The manifold, $\mathcal{N}(\boldsymbol{\mu})$, of mutation rates that are neutral for a new modifier allele is a topological necessity whenever all the loci individually exhibit the reduction result. The finding in Zhivotovsky et al. (1994) that a weighted average of the recombination rates determines whether the new modifier allele increases or not is, in fact, the finding of the neutral manifold in the linear limit. Their manifold can be defined by setting to zero their expression:



$\sum_{s=1} \sum_{t=1} \cdots$ (57)



which produces an $L(L-1) - 1$ dimensional plane in the $L(L-1)$ dimensional space of pairwise recombination rates between $L$ loci. Their manifold is a flat hyperplane, one may infer, as a consequence of the assumptions of weak selection with pairwise additive-by-additive epistasis, which eliminates many nonlinearities. In the current paper, an explicit formula like (57) for the neutral manifold never appears; the existence and properties of this manifold are inferred through topological arguments, purely from the monotonicity and negative partial derivatives of the spectral radius $\partial\rho(\mathbf{M}_{\mu}\mathbf{D})/\partial\mu_\kappa$.

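However, an explicit equation for the manifold can be given for small perturbations of $\boldsymbol{\mu}$. Let

$\mathbf{d} := \nabla\rho(\boldsymbol{\mu}) = \left[ \partial\rho(\mathbf{M}_{\mu}\mathbf{D})/\partial\mu_\kappa \right]_{\kappa=1}^{L}$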



refer to the gradient vector. Its value can be computed explicitly (numerically 
if not analytically) if M^ and D are given, using (1591) and (|5^ . since A^'*\ T 
K, and B all derive from M^, and v and p(M^D) derive from M^ and D. 
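So for small $\boldsymbol{\delta} \in \mathbb{R}^L$, the manifold $\mathcal{N}(\boldsymbol{\mu})$ is approximated near $\boldsymbol{\mu}$ by

$\{\boldsymbol{\mu} + \boldsymbol{\delta} : \mathbf{d}^{\top}\boldsymbol{\delta} = 0\}$. (58)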
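The tangent-plane approximation (58) can be checked numerically. In the sketch below, everything concrete is an assumption made for illustration: two diallelic loci, a symmetric swap matrix `P`, arbitrary fitness values `w`, and a central finite-difference gradient in place of the analytic derivative formulas. It compares the change in spectral radius for a small step along the tangent direction ($\mathbf{d}^{\top}\boldsymbol{\delta} = 0$) with the change for an equally long step along the gradient.

```python
def kron(A, B):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def mutation_matrix(rates, P):
    """M = Kronecker product over loci of (1 - mu) I + mu P."""
    n = len(P)
    M = [[1.0]]
    for mu in rates:
        Mx = [[(1.0 - mu) * (i == j) + mu * P[i][j] for j in range(n)]
              for i in range(n)]
        M = kron(M, Mx)
    return M


def rho(rates, P, w, iters=500):
    """Spectral radius of M D, D = diag(w), by power iteration."""
    M = mutation_matrix(rates, P)
    n = len(w)
    x = [1.0 / n] * n
    r = 0.0
    for _ in range(iters):
        y = [sum(M[i][j] * w[j] * x[j] for j in range(n)) for i in range(n)]
        r = sum(y)
        x = [v / r for v in y]
    return r


def gradient(rates, P, w, h=1e-5):
    """Central finite-difference estimate of d = grad rho at `rates`."""
    d = []
    for k in range(len(rates)):
        up, dn = list(rates), list(rates)
        up[k] += h
        dn[k] -= h
        d.append((rho(up, P, w) - rho(dn, P, w)) / (2.0 * h))
    return d


# Illustrative values (assumptions, not data from the paper).
P = [[0.0, 1.0], [1.0, 0.0]]
w = [1.0, 0.6, 0.5, 0.2]
mu = [0.05, 0.05]

d = gradient(mu, P, w)
norm = (d[0] ** 2 + d[1] ** 2) ** 0.5
step = 1e-3
tangent = [step * d[1] / norm, -step * d[0] / norm]   # satisfies d . delta = 0
normal = [step * d[0] / norm, step * d[1] / norm]     # along the gradient

base = rho(mu, P, w)
change_tangent = abs(rho([mu[0] + tangent[0], mu[1] + tangent[1]], P, w) - base)
change_normal = abs(rho([mu[0] + normal[0], mu[1] + normal[1]], P, w) - base)
print(d, change_tangent, change_normal)
```

With two loci the tangent direction necessarily lowers one rate while raising the other, which is the compensation pattern described in Theorem 5. The residual `change_tangent` is second order in the step length because the neutral set curves away from its tangent line.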



The entries of $\mathbf{d}$ are analogous to the weights $\Lambda_{st}$ in (57), which can be inferred to be proportional to the derivatives of the spectral radius with respect to each recombination rate $r_{st}$. Zhivotovsky et al. (1994) point out that (57) is not simply a total of all the changes in the recombination rates, but a weighted sum whose weights $\Lambda_{st}$ incorporate the intensity of epistasis between loci $s$ and $t$. A simple sum would entail that the derivatives of the spectral radius be all equal, but clearly, the derivatives depend on selection and mutation or recombination distributions in an intricate way.



A little reflection will show that what is found here and in Zhivotovsky et al. (1994) is, indeed, the only possible form that a multivariate reduction principle could take. A multivariate reduction principle should have, as its simplest requirement, that when a new modifier allele changes a single variable, it should increase if and only if it reduces the value of that variable. Section 4.1 shows that this simple requirement leads, through topological necessity, to the existence of the neutral surface of mutation rates with its described properties.

5.4 The Strength of Selection on the Modifier Locus 

It is something of the 'lore' about modifier genes that selection induced on them is weak, since a number of particular cases studied found slow changes in frequency of the modifier alleles (e.g. Karlin and McGregor, 1972b, 1974). With weak selection and pairwise additive-by-additive epistasis, Zhivotovsky et al. (1993) find that selection induced on the modifier allele is quite small and the asymptotic rate of change in the modifier allele on the order of the square of the epistasis. Kondrashov (1995) finds that the modifier alleles change frequency slowly in a model using various assumptions and approximations to estimate the selection induced on a mutation modifier in populations under mutation-selection balance.

But small rates of change are not, in general, a necessity of modifier gene models. As was shown in Altenberg and Feldman (1987, Result 2b), in the extreme case that $\boldsymbol{\eta} = \mathbf{0}$, the asymptotic growth rate of the new modifier allele, $\rho(\mathbf{M}_{\eta}\mathbf{D})$, will equal $\max_i w_i/\bar{w} > 1$, which has an upper bound of $1/\sigma$, where $\sigma$ is the fraction of haplotypes transmitted without change.

Here, $\sigma = \prod_{\xi=1}^{L}(1 - \mu_\xi)$ is the fraction of haplotypes that are transmitted without any mutations. If the number of loci is large, and the values of $\mu_\xi$ moderate, $\sigma$ can be quite small, and the upper bound $1/\sigma$ large. In the Poisson approximation discussed in Q.5 above, part 2, the estimate of $\sigma$ in humans is $\sigma = \Pr[0] = \lambda^{\nu}e^{-\lambda}/\nu! = 1.75^{0}e^{-1.75}/0! = 0.17$, so $1/\sigma = 5.7$. Thus, the upper bound on the strength of induced selection coefficient of a mutation-eliminating modifier allele is around 6, which allows very strong induced selection on the modifier locus. The actual value that $\max_i w_i/\bar{w}$ takes on depends on the specifics of the selection regime and mutation distributions and can be substantially less than $1/\sigma$.
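The bound $1/\sigma$ is easy to evaluate directly. The sketch below assumes, purely for illustration, $L$ loci with equal per-locus rates chosen so that their sum is 1.75 (the locus count and the equal-rate choice are hypothetical); it compares the exact product for $\sigma$ with its Poisson approximation $e^{-\lambda}$.

```python
import math


def sigma_product(rates):
    """Fraction of haplotypes transmitted with no mutation: prod (1 - mu)."""
    s = 1.0
    for mu in rates:
        s *= 1.0 - mu
    return s


lam = 1.75                       # expected mutations per haplotype, from the text
sigma_poisson = math.exp(-lam)   # Poisson approximation, Pr[0]
upper_bound = 1.0 / sigma_poisson

L = 10000                        # hypothetical number of loci
sigma_exact = sigma_product([lam / L] * L)

print(round(sigma_poisson, 2), round(upper_bound, 2), round(sigma_exact, 4))
```

The unrounded reciprocal is close to 5.75, consistent with the bound of around 6 quoted above, and the exact product approaches the Poisson value as the number of loci grows with the total rate held fixed.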

The strength of selection on the new modifier can also be seen to increase with several factors: the magnitude of its change on mutation rates (Theorem 5), the number of loci whose mutation rates it alters (Corollary 3), and the number of cell divisions from zygote to gamete (Corollary 2).

5.5 Relation to Models that Depart from the Reduction Principle

Departures from the reduction result in near-equilibrium populations have been found mainly in models that depart from linear variation. The models here and Zhivotovsky et al. (1994) have variation that is not linear, yet they both produce the multivariate reduction result. What underlying properties can explain this?

As a way to summarize the examples of departures from the reduction principle, Altenberg (1984, pp. 149, 225-228) proposed a 'principle of partial control': when the modifier gene has only partial control over the transformation occurring at loci under selection, then it may be possible for the part it controls to evolve an increase in rates. I offered the following speculation:

If a modifier controls the transformation acting at only one or a few loci, then the transformations acting at other loci will render the variation at this modifier non-linear. It is conceivable, therefore, that a modifier affecting recombination at only a few loci could evolve to increase that recombination when recombination is occurring elsewhere. (Altenberg, 1984, p. 227)



The above possibility is ruled out by the results in Zhivotovsky et al. (1994),
at least for weak selection and pairwise epistasis: even when the modifier has 
only partial control over the recombination events — because it varies only one 
or a few pairwise recombination frequencies — it can only evolve to decrease 
the recombination rates below the neutral manifold. And the same situation 
applies here for mutation rates: any departures from the reduction result due 
to partial control over mutation rates are ruled out. 

One can speculate about what the underlying difference is between these
models and the models that provided the basis for the principle of partial
control, namely: recombination in the presence of mutation (Feldman et al.,
1983), or migration (Charlesworth and Charlesworth, 1979), or segregation and
syngamy (Charlesworth et al., 1979; Holsinger and Feldman, 1983a), or mutation
in the presence of segregation and syngamy (Holsinger and Feldman, 1983b). Each
of the latter models is a mixed process, in which the modifier locus controls
one among multiple transformation processes that differ in their mathematical
structure. In the current paper, there is only one type of process, mutation,
and in Zhivotovsky et al. (1994), there is only recombination. Multiple
instances of linear variation for only one kind of process are compounded
together to produce the variation in transmission of the entire genome, which
is nonlinear.

One may wonder whether it is the independent occurrence of multiple events
that produces the reduction result. Here, multiple mutations occur
independently. But the model of Zhivotovsky et al. (1994) does not assume that
multiple recombination events are independent, and can accommodate arbitrary
interference patterns. So, the independence of events is not essential to the
reduction results observed.

These two models of non-linear variation that preserve the reduction result
do share the following: they use multiple instances of homogeneous genetic
processes to build up a multilocus, multivariate model. Is this the key to their
preservation of the reduction result? Lest one surmise that it is the homogeneity
of processes that is the underlying feature that produces the reduction result,
the following model from Altenberg (1984, pp. 149-151) provides a
counterexample.

The model posits a single locus upon which two different mutation processes
act sequentially. Each process is 'House of Cards' mutation (Kingman, 1978),
where the mutation distribution matrix, $P^{(i)}$, for each process $i$, is a
rank one matrix: $P^{(i)} = \pi^{(i)} e^{\top}$. The modifier gene has linear
control over one of the two processes, and varies either $\mu_1$ or $\mu_2$ in
the expression,

$$M_{\mu_1,\mu_2} = [(1 - \mu_1) I + \mu_1 \pi^{(1)} e^{\top}]\,[(1 - \mu_2) I + \mu_2 \pi^{(2)} e^{\top}]. \tag{59}$$
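
Expanding the product in (59) may help show how the two processes interact. The step below is a sketch that assumes $e$ is the vector of ones and that each $\pi^{(i)}$ sums to one, so that $e^{\top}\pi^{(2)} = 1$. The shorthand $m$ and $q$ is introduced here only for illustration.

```latex
\begin{align*}
M_{\mu_1,\mu_2}
  &= (1-\mu_1)(1-\mu_2)\, I
     + (1-\mu_1)\mu_2\, \pi^{(2)} e^{\top}
     + \mu_1(1-\mu_2)\, \pi^{(1)} e^{\top}
     + \mu_1\mu_2\, \pi^{(1)} \bigl(e^{\top}\pi^{(2)}\bigr) e^{\top} \\
  &= (1-m)\, I + m\, q\, e^{\top},
  \qquad m = \mu_1 + \mu_2 - \mu_1\mu_2,
  \qquad q = \frac{\mu_1\, \pi^{(1)} + (1-\mu_1)\mu_2\, \pi^{(2)}}{m}.
\end{align*}
```

Under these assumptions the composite is again a House of Cards matrix, and raising $\mu_1$ both raises the total rate $m$ and moves the target distribution $q$ toward $\pi^{(1)}$.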

One can craft values for the variables that violate the reduction result: if
$\pi^{(2)}$ is weighted towards the least fit haplotypes, while $\pi^{(1)}$ is
weighted toward the most fit, and $\mu_1$ is small while $\mu_2$ is large, then
a modifier which shifts its subpopulation toward the fitter haplotypes by
increasing mutation rate $\mu_1$ will increase when rare, provided the
variables are in the right ranges (e.g. for two alleles, $w_1 > w_2$,
$\mu_1 = 0.1$, $\mu_2 = 0.4$, $\pi^{(1)}_1 = 0.9$, and $\pi^{(2)}_1 = 0.1$).
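
A small numerical check of this example is sketched below. It rests on several assumptions not stated in the text: fitnesses $w = (1.0, 0.5)$; process 1 puts weight 0.9 on the fitter allele and process 2 puts weight 0.1 on it; the rare modifier raises $\mu_1$ from 0.1 to 0.2; and the modifier's fate is judged by the spectral radius of $M D$, with $D$ the diagonal matrix of fitnesses. The helper names are hypothetical.

```python
def house_of_cards(mu, pi):
    """Return (1 - mu) * I + mu * pi e^T as a column-stochastic matrix."""
    n = len(pi)
    return [[(1.0 - mu) * (i == j) + mu * pi[i] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def spectral_radius(m, iters=2000):
    """Leading eigenvalue of an entrywise positive matrix by power iteration."""
    n = len(m)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        y = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(y)  # x sums to 1, so sum(y) converges to the eigenvalue
        x = [v / lam for v in y]
    return lam


def growth_rate(mu1, mu2, pi1, pi2, w):
    """Spectral radius of M D for the single-locus, two-process form (59)."""
    m = matmul(house_of_cards(mu1, pi1), house_of_cards(mu2, pi2))
    md = [[m[i][j] * w[j] for j in range(len(w))] for i in range(len(w))]
    return spectral_radius(md)


w = [1.0, 0.5]    # assumed fitnesses with w1 > w2 (not given in the text)
pi1 = [0.9, 0.1]  # process 1 targets the fitter allele
pi2 = [0.1, 0.9]  # process 2 targets the less fit allele

resident = growth_rate(0.1, 0.4, pi1, pi2, w)
invader = growth_rate(0.2, 0.4, pi1, pi2, w)  # assumed increase in mu1
print(round(resident, 3), round(invader, 3))
```

With these assumed values the spectral radius rises from roughly 0.736 to roughly 0.777 when $\mu_1$ is increased, so the modifier that raises $\mu_1$ would be expected to grow when rare.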

When these two processes $P^{(1)}$ and $P^{(2)}$ act on two different loci,
however, they can no longer interact in the same way. The mutation matrix then
becomes:

$$M_{\mu_1,\mu_2} = [(1 - \mu_1)\, I \otimes I + \mu_1 (\pi^{(1)} e^{\top}) \otimes I]\,[(1 - \mu_2)\, I \otimes I + \mu_2\, I \otimes (\pi^{(2)} e^{\top})]. \tag{60}$$

Indeed, since $P^{(1)}$ and $P^{(2)}$ are symmetrizable, (60) is simply an
instance of (PT|), the model analyzed here, so the reduction results of
Theorem 2 apply. So, we see that the reduction principle applies to nonlinear
variation of the form (60), but not of the form (59). The only difference
between them is that the two mutation processes have a single target in (59),
but separate targets in (60).
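
The contrast with the two-locus form (60) can be sketched numerically in the same way. The assumptions here are again illustrative: two alleles per locus, the same $\pi^{(1)}$, $\pi^{(2)}$, and mutation rates as in the single-locus example, and an arbitrary, non-multiplicative set of haplotype fitnesses. The helper names are hypothetical.

```python
def house_of_cards(mu, pi):
    """Return (1 - mu) * I + mu * pi e^T as a column-stochastic matrix."""
    n = len(pi)
    return [[(1.0 - mu) * (i == j) + mu * pi[i] for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product of two nested-list matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def spectral_radius(m, iters=2000):
    """Leading eigenvalue of an entrywise positive matrix by power iteration."""
    n = len(m)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        y = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(y)  # x sums to 1, so sum(y) converges to the eigenvalue
        x = [v / lam for v in y]
    return lam


def growth_rate_two_loci(mu1, mu2, pi1, pi2, w):
    """Spectral radius of M D for the two-locus form (60)."""
    first = kron(house_of_cards(mu1, pi1), identity(len(pi2)))   # locus 1 only
    second = kron(identity(len(pi1)), house_of_cards(mu2, pi2))  # locus 2 only
    m = matmul(first, second)
    md = [[m[i][j] * w[j] for j in range(len(w))] for i in range(len(w))]
    return spectral_radius(md)


# Assumed haplotype fitnesses, ordered (locus 1 allele, locus 2 allele):
# (1,1), (1,2), (2,1), (2,2).
w = [1.0, 0.8, 0.6, 0.4]
pi1 = [0.9, 0.1]
pi2 = [0.1, 0.9]

low_mu1 = growth_rate_two_loci(0.1, 0.4, pi1, pi2, w)
high_mu1 = growth_rate_two_loci(0.2, 0.4, pi1, pi2, w)
print(high_mu1 < low_mu1)
```

Here raising $\mu_1$ lowers the spectral radius, as the reduction result leads one to expect, even though $\pi^{(1)}$ still favors the fitter allele at its locus.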

The picture that emerges is that when mixed processes are acting on the 
same set of loci, the expansion of one process can sometimes systematically shift 
the population toward the fitter genotypes, and cause modifiers that support 
this expansion to survive. This is the essence of the deterministic mutation
hypothesis for the evolution of sex and recombination (Kondrashov, 1982). The
theoretical question then becomes, how do we identify which combinations of
processes and conditions on selection will produce this effect?

One can make a wild conjecture at this point: that in all of the cases of
modifier models where a mixing of forces produces departures from the reduction
principle, a 'separation of forces' into linear variation on separate loci,
provided it is feasible to follow a form similar to going from (59) to (60),
will restore the reduction result. Evaluation of this conjecture is deferred to
future work.